
Sunday, 29 April 2012

Tracking Your Web Visitors

So you've created the ultimate Web site, and now you're sitting back watching your hit counter go wild. You may ask yourself, "I wonder how many pageviews my help page is getting?" or, "I wonder how many people are visiting my site?" Unfortunately, when most people start building a Web site, they don't consider that they might someday want to track its traffic. It takes enough time just to design the site and create the content. Outlining what information they want to track is just more work that already overworked staff tend to let slide. 
But when it comes down to it, we all quickly become bean counters on the Web. Once a site is up and running, we want to know how many people are looking at our pages and how many pages each of those people is looking at. That's usually when a lot of Web developers discover that had they spent more time thinking about setting up their site, they'd be able to track how it's being used much more easily. 

If you're in this situation right now, you've come to the right place. And if you haven't made your site public yet, you're lucky - you still have time to think about reporting before your design is set in stone. Don't miss out on this chance! 

What Information Is Available?






Before you can decide what type of analysis you want to do, you need to know what information is available. Unfortunately, there's not much tracking data you can collect, and what you can get is unreliable. But don't despair - you can still gain useful knowledge from what does exist. 

Your Web servers can record information about every request they get. The information available to you for each request includes:
·         Date and time of the hit (we'll look more closely at what hits are later on)
·         Name of the host
·         Request
·         Visitor's login name (if the user is authenticated)
·         Web server's response code
·         Referer
·         Visitor's user agent
·         Visitor's IP address
·         Visitor's host (if the visitor's IP address can be translated)
·         Bytes transferred
·         Path of the file served
·         Cookies sent by the visitor
·         Cookies sent by the Web server 
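Most of the fields above land in your Web server's access log. Here is a minimal sketch of pulling them out of a log line, written in Python (the article's era would have used Perl); the regular expression follows the common "combined" log format, and the sample line is invented for illustration:

```python
import re

# Combined Log Format: host ident authuser [date] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_hit(line):
    """Parse one access-log line into a dict of the fields listed above."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

# A made-up sample hit for illustration.
sample = ('152.163.199.42 - sue [29/Apr/2012:08:15:00 -0700] '
          '"GET /help/index.html HTTP/1.0" 200 5120 '
          '"http://www.example.com/links.html" "Mozilla/4.0 (compatible; MSIE 5.0)"')
hit = parse_hit(sample)
```

Once each hit is a dictionary like this, every analysis below reduces to filtering and counting those dictionaries.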

Inaccurate, But Not Useless

As I mentioned before, the information you have available is inaccurate but not completely unreliable. Although this data is inexact, you can still use it to gain a better understanding of how people use your site. 

To start things off, let's take the 10,000-foot view of everything available and then drop slowly toward the details. So, first let's talk about hits and pageviews. (If you didn't know already - there is a difference. A hit is any request for a file your server receives. That includes images, sound files, and anything else that may appear on a page. A pageview is a little more accurate because it counts a page as a whole - not all its parts.) 

As you probably already know, it's quite easy to find out how many hits you're getting with a simple hit counter, but for more precise analysis, you're going to have to store the information about the hits you get. An easy way to do this is simply to save the information in your Web server log files and periodically load database tables with that data or to write the information directly to database tables. (For those database-savvy readers, if you periodically load database tables using a 3GL and ODBC- or RDBMS-dependent APIs, you can use data-loading tools from the RDBMS vendor - such as Sybase's BCP - or you can use a third-party, data-loading product. Here is a partial list of products.)

If you load your data directly into a database, you will either need a Web server with the capability already implemented (such as Microsoft's IIS), or you will need the source code for the server. Another option is to use a third-party API, like Apache's DBILogger. Once you do that, you can gather information about how many failed hits you're getting - just count the number of hits with a status code in the 400s. And if you're curious, you can drill down farther by grouping by each status code separately. 
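Counting failed hits, and then drilling down by individual status code, can be sketched like this; it assumes hits have already been parsed into dictionaries with a "status" field, and the sample data is invented:

```python
from collections import Counter

def failed_hits_by_status(hits):
    """Count hits whose status code is in the 400s, one bucket per code."""
    return Counter(h["status"] for h in hits if 400 <= int(h["status"]) <= 499)

# Invented sample data: one success, two 404s, one 403.
hits = [{"status": "200"}, {"status": "404"}, {"status": "404"}, {"status": "403"}]
failures = failed_hits_by_status(hits)
```

Summing the counter gives the total failed-hit count; the per-code buckets give the drill-down.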

Pageviews






On the whole, though, counting hits isn't as informative as counting pageviews. And the results aren't comparable to those of other sites (see the Internet Advertising Bureau's industry-standard metrics). To count pageviews, you need to devise some method of differentiating hits that are pageviews from those that are not. Here are some of the factors we take into account when doing this at Wired Digital: 

·         Name of the file served
·         Type of the file served (HTML, GIF, WAV, and so on)
·         Web server's response code (for instance, we never count failed requests - those with a status code in the 400s)
·         Visitor's host (we don't count pageviews generated by Wired employees)
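A classifier built on those factors might look like the sketch below. The file extensions, the internal host name, and the request format are all assumptions for illustration, not Wired Digital's actual rules:

```python
PAGE_EXTENSIONS = {".html", ".htm"}       # assumption: only HTML documents count as pages
INTERNAL_HOSTS = {"proxy.wired.com"}      # hypothetical in-house host to exclude

def is_pageview(hit):
    """Apply the factors above: file type, response code, and visitor host."""
    parts = hit["request"].split()
    path = parts[1] if len(parts) > 1 else ""
    if path != "/" and not any(path.endswith(ext) for ext in PAGE_EXTENSIONS):
        return False                      # images, sounds, etc. are hits but not pageviews
    if hit["status"].startswith("4"):     # never count failed requests
        return False
    if hit["host"] in INTERNAL_HOSTS:     # don't count our own employees
        return False
    return True
```

Filtering your hits through a predicate like this is what turns a hit count into a pageview count.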

Once you've determined which hits are pageviews and which are not, you can count the number of pageviews your site gets. But you'll probably want to drill down in your data eventually to determine how many pageviews each of your pages gets individually. Furthermore, if you split your site into channels or sections - we separate our content into HotBot, HotWired, Wired News, and Suck - you may want to determine how many pageviews each area gets. This is where standards for site design can help. 

If this standard is in place at all levels of your site, you can summarize and drill down through your pageviews at will. Of course, there are some problems with this method. You may want to count a pageview in one section part of the time and in another section at other times. There are ways (that I won't go into now), however, to get around these problems. We've found over the years that this method works best - at least for us. 
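One such design standard is giving each channel its own URL prefix, so a request path maps directly to a section. A sketch, with hypothetical prefixes standing in for whatever convention your site adopts:

```python
SECTIONS = {                 # hypothetical path-prefix convention for channels
    "/hotbot/": "HotBot",
    "/hotwired/": "HotWired",
    "/news/": "Wired News",
}

def section_of(path):
    """Map a request path to its channel using the first matching prefix."""
    for prefix, name in SECTIONS.items():
        if path.startswith(prefix):
            return name
    return "Other"
```

With this in place, summarizing pageviews per section is just grouping by `section_of(path)`.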

Looking Deeper Into Pageviews






Once you've cut your teeth on some programs designed to retrieve the types of information I've just explained, you should be able to use your knowledge to code programs to give you the following: 

·         Pageviews by time bucket You can look at how pageviews change every five minutes for a day. This will tell you when people are accessing your site. If you also group pageviews by your visitors' root domains, you can determine whether people visit your site before work hours, during work, or after work. 

·         Pageviews by logged-in visitors vs. pageviews by visitors who haven't logged in What percentage of your pageviews come from logged-in visitors? This information can help you determine whether allowing people to log in is worthwhile. You can also get some indication of how your site might perform if you required visitors to log in. 

·         Pageviews by referrer When your visitors come to one of your pages via a link or banner, where do they come from? This information can help you determine your visitors' interests (you'll know what other sites they visit). And if you advertise, this information can help you decide where to put your advertising dollars. It can also help you decide more intelligently which sites you want to partner with - if you're considering such an endeavor. 

·         Pageviews by visitor hardware platform, operating system, browser, and/or browser version What percentage of your pageviews come from visitors using Macs? Using PCs? From visitors using Netscape? Internet Explorer? It will take a bit of work to cull this information out of the user agent string, but it can be done. Oh, and since browsers are continually being created and updated, and therefore the number of possible values in the user agent string continues to grow larger, you'll have to keep up to date on whatever method you use to parse this information. 

·         Pageviews by visitors' host How many of your pageviews come from visitors using AOL? Earthlink? 

Note that you may want to mix and match these various dimensions. For example, how do your referrals change over time? Does the relative percentage of Netscape users vs. Internet Explorer users change over the course of the day? Does one area of your site seem to interest Unix users more than other areas? 
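The "mix and match" idea, for example time buckets crossed with root domains, can be sketched as follows. The five-minute bucket width comes from the list above; the timestamps-as-minutes representation and the sample hosts are simplifying assumptions:

```python
from collections import Counter

def root_domain(host):
    """Crude root domain: the last two labels ('dialup7.aol.com' -> 'aol.com')."""
    parts = host.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host

def bucket_counts(hits, bucket_minutes=5):
    """Count pageviews per (time bucket, root domain) pair.
    Each hit carries its time as minutes since midnight, for simplicity."""
    counts = Counter()
    for h in hits:
        bucket = h["minute"] // bucket_minutes * bucket_minutes
        counts[(bucket, root_domain(h["host"]))] += 1
    return counts

# Invented sample: two AOL hits around 8 a.m., one Earthlink hit around 10 a.m.
hits = [{"minute": 481, "host": "dialup7.aol.com"},
        {"minute": 483, "host": "dialup9.aol.com"},
        {"minute": 601, "host": "gw.earthlink.net"}]
```

Adding a third key (say, browser or section) to the counter tuple answers the crossed questions in the paragraph above.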

How To Count Unique Visitors






Now let's talk about visitor information. Look at the bulleted paragraphs above and replace the word "pageviews" with the word "visitors." Interesting, huh? Unfortunately, counting visitors is more difficult than counting pageviews.
First off, let's get one thing out in the open: There is absolutely no way to count visitors reliably. Until Big Brother ties people to their computers and those computers scan their retinas or fingerprints to supply you with this information, you'll never be sure who's visiting your site. Basically, there are three types of information you can utilize to track visitors: their IP addresses, their member names (if your site uses membership), and their cookies. 

The most readily available piece of information is the visitor's IP address. To count visitors, you simply count the number of unique IP addresses in your logs. Unfortunately, easiest isn't always best. This method is the most inaccurate one available to you. Most people connecting to the Net get a different IP address every time they connect. That's because ISPs and organizations like AOL assign addresses dynamically in order to use the limited block of IP addresses given to them more efficiently. When an AOL customer connects, AOL assigns them an IP address. And when they disconnect, AOL makes that IP address available to another customer. 

For example, Sue connects via AOL at 8 a.m. and is given the IP address 152.163.199.42, visits your site, and disconnects. At 10 a.m., Bob connects via AOL and is assigned the same IP address. He visits your site and then disconnects. Later, as you're tallying the unique IP addresses in your logs, you'll unknowingly count Sue and Bob as one visitor. This method becomes increasingly inaccurate if you're examining data over longer time periods. We only use this information in our calculations at Wired Digital as a last resort, and then only when we're looking at a single day's worth of data. 

If you allow people to log in to your site through membership, you have another piece of information available to you. If you require people to log in, visitor tracking becomes much easier. And if you require people to enter their passwords each time they log in, you're in tracking heaven. As we all know, though, there's a downside to making people log in - namely that a lot of people don't like the process and won't come to your site if you require it.
If you do force people to log in, however, you can count the number of unique member names and easily determine how many people visit your site. If you don't force people to log in, but do give them the option to do so, you can count the number of unique member names; then, for those hits without member names attached, you can count the number of unique IP addresses instead. 

Lastly, you can add cookies to your arsenal. Define a cookie that will have a unique value for every visitor. Let's call it a machine ID (I'll explain this later). If a person visits you without providing you with a machine ID (either because she hasn't visited your site before or because she's set her browser not to accept cookies), calculate a new value and send a cookie along with the page she requested. So now you can count the number of unique machine IDs in your log. But there are still a couple of issues that we need to discuss. First, as I've already mentioned, many people turn off their cookies, so you can't rely on cookies alone to count your visitors. At Wired Digital, we use a combination of cookies, member names, and IP addresses to count visitors, with the caveat that, as I said earlier, we don't use IP addresses when counting more than a single day's traffic. 
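Issuing the machine ID can be sketched like this; the cookie name and the use of a random 128-bit value are assumptions for illustration, not a prescribed scheme:

```python
import uuid

def ensure_machine_id(request_cookies, response_headers):
    """If the request carried no machine-ID cookie, mint one and set it on the response."""
    machine_id = request_cookies.get("machine_id")
    if machine_id is None:
        # Assumption: a random 128-bit ID is unique enough for counting purposes.
        machine_id = uuid.uuid4().hex
        response_headers.append(("Set-Cookie", "machine_id=%s; Path=/" % machine_id))
    return machine_id
```

Visitors who already present a machine ID pass through untouched; everyone else gets a fresh one on the way out.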

Second, the cookie specification allows browsers to delete old cookies. And even if this option wasn't specified, a user's hard disk can always fill up. Either way, the cookies you send to a visitor may be removed at some point. So it's possible that a person who visits your site at 8 a.m will no longer have your cookie when they return at 9 a.m. Third, when your Web server sends a cookie to a visitor, it's stored on the visitor's machine - so if a person visits your site from home in the morning using her desktop machine and visits again from work using another PC, you'll log two different cookies. Which is why I've called the cookie a "machine ID": it's tied to the machine, not the visitor. 

Which brings us to issue number four: Multiple people may use the same machine, in which case you'll see only one cookie for all of them. Fifth, various proxy servers may handle cookies differently. It's possible that a given proxy server won't deliver cookies to the user's machine. Or it might not deliver the correct cookie to the user's machine (it might even deliver some other cookie from its cache). Or it might not send the user's cookie back to your Web server. Unfortunately, proxy servers are still young. There is no formal and complete standard for how they're supposed to work, and there's no certification service to ensure that they'll do what they're supposed to do. So with all these issues to consider, here's what we do at Wired Digital: 

·         If we want to count visitors for one day, we count member names.
·         For hits that don't have member names, we count cookies.
·         For hits that have neither member names nor cookies, we count IP addresses. 

And if we want to count visitors over multiple days, we only use cookies. We do some statistical analysis in an attempt to determine how much of an undercount results - but in the end, all these calculations are only estimates.
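That cascade (member names, then cookies, then IP addresses) can be sketched as below; the field names are hypothetical, and the IP step is switched off for multi-day counts, per the caveat above:

```python
def count_visitors(hits, use_ip=True):
    """Count unique visitors: member names first, then cookies, then IP addresses."""
    members = {h["member"] for h in hits if h.get("member")}
    cookies = {h["cookie"] for h in hits
               if not h.get("member") and h.get("cookie")}
    ips = set()
    if use_ip:  # skipped for multi-day counts, where dynamic IPs are too unstable
        ips = {h["ip"] for h in hits
               if not h.get("member") and not h.get("cookie")}
    return len(members) + len(cookies) + len(ips)
```

Each hit is attributed to exactly one tier, so a logged-in visitor's cookie and IP address never inflate the total.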
There's one more issue we need to discuss. Do you want to track the information you have over multiple days? Or is one day's worth enough? If one day's data will suffice, you can get away with simple programs that process your log files. If you prefer to process multiple days' information, however, you'll want to store it all in a database.


Long Distance Data Tracking

In my last article, I introduced the types of tracking information you can get from your Web server. Back then, I concentrated mostly on what you can do with a single day's worth of data. Now I'm going to show you what long-range data tracking can do for you. Some questions can only be answered by looking at your data over an extended period of time: 

·         How fast is my number of pageviews increasing? How many pageviews should I expect by the end of the year?
·         Which areas of my site are experiencing the fastest pageview growth? The slowest?
·         How is the relative browser share changing over time?
·         How often do people visit my site?
·         Of the people who first came to my site via my ad banner on xyz.com, how many pages have they subsequently viewed? 

And I'm sure that once you look at the types of information available (discussed in my previous article), you'll come up with all sorts of questions that need long-range answers. If you're interested in answering these questions, then multi-day tracking is for you. And if you're thinking of tracking, then it's time to seriously consider a database. 

Getting Down to Database-ics






You could create from-scratch programs to retrieve the information you want out of your hit logs. Of course you could also spend your life banging your head against a wall. But neither option is really in your best interest. And the more hits you get per day, the more you'll find good reasons to store your hits in a database: 

·         If you design your database correctly, your queries will return the information you want many times faster than programs that retrieve data from log files. And the more data you have, the more you'll notice the difference in performance. 

·         If you only store the hits that interest you (versus every single li'l ol' image request), you can significantly reduce the amount of space your data requires. 

·         Most people use SQL (Structured Query Language) to retrieve data from databases. SQL is a small, concise language with very few commands and syntax elements to learn. Plus, the command structures are simple and well defined, so good programmers can create an SQL query much more quickly than they could code a program to do the same thing. And the resulting SQL query would be less prone to errors and easier to understand. 
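For a taste of how concise such a query is, here is a sketch using Python's built-in SQLite as a stand-in for whatever RDBMS you pick; the table layout and sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (day TEXT, path TEXT, status INTEGER)")
conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", [
    ("1998-05-01", "/index.html", 200),
    ("1998-05-01", "/help.html", 200),
    ("1998-05-02", "/index.html", 404),
])

# One short SQL statement: successful pageviews per day.
rows = conn.execute(
    "SELECT day, COUNT(*) FROM hits "
    "WHERE status < 400 GROUP BY day ORDER BY day"
).fetchall()
```

The equivalent log-file-scanning program would be many times longer and easier to get wrong.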

·         If you don't want to code SQL, you can use a database access tool (e.g., MS Access or Excel, Crystal Reports, or BusinessObjects) to retrieve information. Many of these tools are extremely easy to use, with a graphical, drag-and-drop interface. 

·         You could also create your own program using one of a smorgasbord of application development tools that make creating a data-retrieving program relatively simple. Check out DM Review for a list of tools. Of course it's nice to know that, with most database products, you aren't prevented from writing your applications in your favorite 3GL. Many provide ODBC access as well as proprietary APIs. For example, at Wired Digital we've written our reporting application in Perl, using both Sybase's CTlib and the DBI package for database access. 

On the other hand, some distinct reasons exist NOT to store your data in a database: 

·         You actually have to implement and maintain the code for loading your data into the database. 
·         Most databases require some resources for administration.
·         Most database products cost money.
·         You will have to learn SQL, or whatever language the database product you select implements.
·         Databases are inherently more fragile than flat files. You will have to spend more time making sure you have a good "backup and restore" plan. 

Still interested in a database? Now you have to choose: 1) whether to load your hits directly into a database from your Web server, and 2) which database product to load your hits into. Note that these decisions aren't independent - it may be difficult, if not impossible, to load hits into some databases, and some databases may not allow data inserts while queries are being run against them. 

The Direct Route

Loading your data directly from your Web server into a database can add all sorts of complexity to your life. If you choose this route, you have to decide whether you can live with lost data. If you can, you may skip the next few paragraphs. Otherwise, read on. For reasons I won't go into here, higher-end database products use database managers that handle all accesses to the database. Since database managers are software programs, they can fail. So if you have your Web server load its data directly into one of these databases, and the database manager crashes, you may lose this information. 

Some Web servers allow you to write code that stores the Web server's information in a log file if the database manager crashes (especially if you have the source code). Of course, in this case you will also have to design a backup process that gets information into your database for those times when your database goes down. 

Pick a Database Management System

Here is a partial list of the database products available to you:

·         IBM, DB2: Never count IBM out.
·         Informix, Dynamic Server: Recent company financial problems, but a top-notch RDBMS.
·         MSQL: Shareware! Created by David J. Hughes at Bond University, Australia.
·         Microsoft, Access: Low-end, user-friendly RDBMS.
·         Microsoft, SQL Server: Mid-range RDBMS. Microsoft's tenacity continues to improve this product.
·         NCR, Teradata: The Ferrari Testarossa of data warehousing engines ... at Testarossa prices. For very large databases.
·         Oracle, Oracle8: The leading RDBMS.
·         Red Brick Systems, Red Brick: RDBMS designed specifically for data warehousing. This is what we use at Wired Digital.
·         Sybase, Adaptive Server: Number 2 in the RDBMS market. We use this at Wired Digital for non-data warehouse applications.

After selecting a database product, you have to design the structure where your data will live. Luckily, your job will be easier than most database designers' because, in the case of Web tracking, there aren't that many different types of information to store. Here are some goals to shoot for when you design your database: 

·         minimize load times
·         minimize query times
·         minimize administration and maintenance
·         minimize database size 

To achieve these goals, all sorts of decisions need to be made. For example, the time it takes to load your data will depend on how much data you want to load, whether you use "lookup" tables, whether your database is stored on a RAID system, and so on. Also, these goals sometimes conflict. For example, to minimize query time, you may have to create and maintain summary tables. But if you do this, administration and maintenance time increases, and the size of your database grows. And as you make these database decisions, don't forget that people who look at your data will, at some point, want to audit and compare it with the data in your Web server log files. 

Finally, if you have experience designing data warehouses, do a clean boot of your brain. This will be unlike any other data warehouse you have designed. For example, a merchandiser like Wal-Mart knows what products it sells and at which stores it sells them. For each product, it knows what category it belongs to, who manufactures it, and what it costs. For each store it knows which geographic region it's in, what country it's in, and its size. All of these "dimensions" are limited in the number of values they can have: when a merchandiser loads sales data into its data warehouse, it doesn't have to deal with unknown entities. 

Your tracking data warehouse application, however, will constantly deal with unknowns. You don't know what domains visitors will be coming from, where referrals will be coming from, or what browsers those visitors will be using. And when your users enter information into forms, you may not know what values they'll be entering (especially if your forms contain text fields). And there's no telling how many values these "dimensions" will have. So pick your tools wisely, and get tracking. 


Troubles with Tracking







My last two articles discussed tracking: The first covered what you can track, and the second dealt with how you can track over time. In this article, I'm going to show you what you can't do by thoroughly demoralizing you with some of the limitations of your available information. No, I'm not a sadist, but it's best that you know what problems you'll be facing, as well as some possible work-arounds. So now, when your boss or a customer asks you why you can't give them exact information, you can point them to this article. 

Counting Pageviews






The number of pageviews you count is not the actual number of pageviews of your site. "How can this be?" you ask. "I'm simply counting records in my Web server's access log." Well, the fact is a lot of requests never make it to your access log. 

First, browsers - at least Netscape and Internet Explorer - have caches. If a person requests a page from your site and soon requests it again, the browser may not go back to your server to request the page a second time. Instead, it may simply retrieve it from its cache. And you would never know. You can try using "expires" or "no cache" tags to stop browsers from caching your pages, but you can never be sure if your tags are read or not. 
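The anti-caching response headers look like the sketch below. The header names are standard HTTP (Cache-Control is the HTTP/1.1 form, Pragma the HTTP/1.0 one); whether a given browser or proxy honors them is, as noted, out of your hands:

```python
# Response headers that ask browsers and proxies not to cache the page.
NO_CACHE_HEADERS = [
    ("Cache-Control", "no-cache, no-store, must-revalidate"),  # HTTP/1.1 caches
    ("Pragma", "no-cache"),                                    # HTTP/1.0 caches and proxies
    ("Expires", "Thu, 01 Jan 1970 00:00:00 GMT"),              # a date safely in the past
]

def add_no_cache(headers):
    """Append the anti-caching headers to an outgoing response's header list."""
    headers.extend(NO_CACHE_HEADERS)
    return headers
```

These are requests, not commands: a cache is free to ignore them, which is exactly why your counts stay approximate.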

Second, let's say that a user's browser doesn't retrieve your page from its cache but actually re-requests the page from your server. Many ISPs use proxy servers, and proxy servers cache pages just like browsers. If a person using an ISP with a proxy server makes a request, the proxy server first checks its cache. If the page is there, it serves that page to the person, instead of going to your server. And you would never know. 

Again, you can try using the tags I've described above, but there's no Proxy Server Police making sure proxy servers respect your tags.
Another tracking obstacle is bots, or spiders. These software programs scour the Web, either cataloging pages for search engines or looking for information for their owners. 

Do you care if your pageview counts include hits from bots? If you do care, then you'd better find a way to ignore these hits. You can create a list of IP addresses to ignore, but with new bots born every day, the list will always be one step - or 100 steps - behind. Similarly, you can use the requester's user-agent string, but there's nothing keeping developers from sending any old string they please. Lastly, you can take a daily count of the hits and just ignore repeat hits from the same IP address if their total number passes some threshold. 

Then you run the risk of accidentally ignoring hits from an ISP that uses a proxy server and sends its own IP address - instead of a different IP address for each user. With no perfect solutions, it's up to you to decide which method you can learn to live with. 
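The user-agent and threshold approaches can be combined as in the sketch below; the substring list and the 500-hit threshold are arbitrary illustrations, and, per the risk just described, the threshold can mistakenly drop a busy proxy:

```python
from collections import Counter

BOT_AGENT_HINTS = ("bot", "spider", "crawler")   # substrings suggesting a robot (illustrative)

def filter_bots(hits, daily_threshold=500):
    """Drop hits whose user agent looks robotic or whose IP exceeds a daily hit threshold."""
    per_ip = Counter(h["ip"] for h in hits)
    kept = []
    for h in hits:
        agent = h.get("user_agent", "").lower()
        if any(hint in agent for hint in BOT_AGENT_HINTS):
            continue                              # agent string claims to be a bot
        if per_ip[h["ip"]] > daily_threshold:
            continue                              # suspiciously busy address; may be a proxy
        kept.append(h)
    return kept
```

Neither test is reliable on its own, which is why the choice of method comes down to which errors you can live with.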

Counting Visitors






So, if you're not able to accurately record every single request, of course you can't get a full count of your site's visitors. And that's not your only problem.
I discussed some tracking issues before. One problem I didn't discuss concerns cookies and new visitors. Let's say that you want to count the number of visitors you had yesterday, and you use the methodology we discussed previously. 

When a person visits your site for the first time, they don't yet have a cookie, and their request will arrive without one. Your Web server promptly sends the visitor a new cookie along with the requested page. Now, say the visitor then requests a second page from your site. And this time the visitor's request does come with a cookie, so the record of the visitor's hit will have a cookie. 

When you use your Perl script (or whatever) to count visitors, you first count member names, if you allow people to authenticate. For hits that don't have member names, you count cookies. Lastly, for hits that don't have member names or cookies, you count remote IP addresses. But this process double-counts new visitors. A visitor's first hit won't have a cookie or member name, and so its IP address will be counted. The same visitor's following hits will be counted either with the count of member names or cookies. 

At Wired Digital, we handle this by logging the times a cookie is sent, yet we don't receive a cookie. Every night, we look for hits that contain a cookie sent. For each one, we check for other hits with a received cookie equal to that sent cookie. If we find any, we move the cookie-sent value into the cookie-received field before we load the hit into our data warehouse. When we use our counting methodology, this person will be counted just once. Note that we don't simply merge the cookies sent with the cookies received. Doing this would multi-count people who have disabled cookies. 
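That nightly reconciliation can be sketched as below; the cookie_sent/cookie_received field names are hypothetical, and, as described, a sent cookie is only promoted when some later hit actually echoed it back:

```python
def reconcile_new_visitors(hits):
    """Move cookie_sent into cookie_received when a later hit echoed that cookie back."""
    received = {h.get("cookie_received") for h in hits if h.get("cookie_received")}
    for h in hits:
        sent = h.get("cookie_sent")
        if sent and not h.get("cookie_received") and sent in received:
            # The cookie came back, so this was the same person's first hit.
            # Cookies that never return stay untouched: their owner likely
            # has cookies disabled, and merging would multi-count them.
            h["cookie_received"] = sent
    return hits
```

After this pass, the cascade counting method sees the new visitor's first hit and later hits under one cookie, counting the person once.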

OK, let's say you have more than one domain out of which you serve hits: for example, uvw.com and xyz.com. You can count the number of visitors who go to uvw.com, and you can count the number who visited xyz.com, but the total number will almost certainly not equal the sum of the two. 

Why can't you get this number? Let's say a visitor comes to uvw.com. The visitor doesn't have a cookie, so your Web server sends one. Let's say the visitor then goes to xyz.com. The visitor's browser won't send the uvw.com cookie to the xyz.com Web server. That's verboten (see Marc's article, "That's the Way the Cookie Crumbles"). Therefore, xyz.com's Web server sends yet another cookie to the visitor, making a total of two different cookies for one visitor. And never the twain shall meet. 

How do you get around this problem? As a tracking guy, I do my best to push for one primary domain. For example, uvw.common.com and xyz.common.com. This allows you to use one set of cookies. If you can't make that happen, you've got some work ahead of you. I'm afraid I can't go into our methodology at Wired Digital (if I told you, I'd have to kill you.... yada, yada, yada), but there are ways to get around this limitation. I'll have to leave this as a take-home exercise. 
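With one primary domain, the trick is scoping the cookie to the shared parent so every subdomain receives it. A sketch, reusing the article's hypothetical common.com domain:

```python
def shared_domain_cookie(name, value, parent_domain="common.com"):
    """Build a Set-Cookie header scoped to the parent domain, so that
    uvw.common.com and xyz.common.com both receive the same cookie.
    The domain name here is the article's hypothetical example."""
    return ("Set-Cookie",
            "%s=%s; Domain=.%s; Path=/" % (name, value, parent_domain))
```

Browsers send a cookie whose Domain is `.common.com` to every host under that domain, so both sites see one machine ID per visitor.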

Bots can also wreak havoc in this situation. If one or more bots hit you, your visitor numbers won't be affected much. But if you calculate pageviews per visitor and you ignore bots, your numbers may be skewed. 

Tracking Browsers and Platforms






A browser can send your Web server any user-agent string it wants, so whatever reporting you do based on these numbers is a matter of trust. Given that the vast majority of people use Netscape or Internet Explorer, you can feel pretty confident about these numbers. Of course, if one browser cache is better than another's, the number of pageviews you see from the former will be lower than the latter. I probably shouldn't have mentioned that: You know there's a marketing wiz at one of these companies who is asking the development team right now to turn off the browser's caching capability. 
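A deliberately naive classifier for the era's two big browsers might look like this; real user-agent strings are messier and, as the paragraph above notes, can be outright lies, so treat this as a sketch that needs constant upkeep:

```python
def classify_user_agent(agent):
    """Very rough browser classification from a user-agent string."""
    # MSIE must be checked first: Internet Explorer announces itself as
    # "Mozilla/4.0 (compatible; MSIE ...)" for compatibility reasons.
    if "MSIE" in agent:
        return "Internet Explorer"
    if "Mozilla" in agent:
        return "Netscape"
    return "Other"
```

The ordering trap in the comment is exactly the kind of detail that makes user-agent parsing a maintenance chore.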

Calculating Visits/Sessions






Marketers and advertisers love the concept of the visit, i.e., how long a person stays at a site before moving on. Yet this number is impossible to determine using HTTP. Let's say I request a page from HotBot at noon. Then I request another page from HotBot at 12:19 p.m. How long was my HotBot visit? You can never know for sure. It's possible that I stared at the first HotBot page for the full 19 minutes. But I may just as easily have opened another browser window and read Wired News for the duration of those 19 minutes. Then again, I may have walked to 7-Eleven for a Big Gulp. 

Yet your customers demand this information. So, what do you tell them?
Well, you turn to the Internet Advertising Bureau, which defines a visit as "a series of page requests by a visitor without 30 consecutive minutes of inactivity." When people ask about the length of your users' visits, go ahead and tell them, based on the IAB's definition. If you feel like wasting a little time, tell them how the numbers are meaningless until your face turns blue. 
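Sessionizing by the IAB's 30-minute rule can be sketched like this; timestamps are simplified to minutes, and a gap of 30 or more minutes starts a new visit:

```python
def count_visits(timestamps, gap_minutes=30):
    """Count IAB-style visits for one visitor: a new visit starts after
    30 or more consecutive minutes of inactivity.
    `timestamps` are request times in minutes, in any order."""
    if not timestamps:
        return 0
    visits = 1
    ordered = sorted(timestamps)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev >= gap_minutes:
            visits += 1
    return visits
```

By this rule, the noon and 12:19 p.m. HotBot requests in the example above fall in one visit, whatever I was actually doing in between.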

Counting Referrals






If a visitor clicks on a link or a banner to get to your site, the visitor's browser will send the URL of the site he or she just left, along with the request. This URL is called the "referer." In a successful attempt to make our lives more difficult, Netscape and Microsoft coded their browsers to handle the passing of referral information differently. Specifically, if you click on a link that takes you to a page that features frames, your Netscape browser will send the original page as the referer to the frame-set page, as well as the pages that make up each individual frame. Internet Explorer will send the original page as the referer to the outer (frame set) page, which in turn sends the URL of the outer page as the referer to the individual frames. Check it out for yourself, and see what a difference a browser makes. This example is made up of the following files:
·         referer.html





What does this mean? Basically, if your site features frames and you want to track your referrals to a specific frame, you will have to handle each browser differently. 

Wrap-Up






Are you thoroughly frustrated? If not, I admire your bright-and-sunny outlook - you should look into becoming an air-traffic controller; you'd be perfect. Otherwise, I want to remind you that even if every single piece of the tracking puzzle is a nightmare of confusion, you can assemble a picture of your site traffic. It won't be perfect - far from it - but it will provide you with enough information to get an idea of how you're doing and how you can build a better site.



Digital Subscriber Line


Background

Digital Subscriber Line (DSL) technology is a modem technology that uses existing twisted-pair telephone lines to transport high-bandwidth data, such as multimedia and video, to service subscribers. The term xDSL covers a number of similar yet competing forms of DSL, including ADSL, SDSL, HDSL, RADSL, and VDSL. xDSL is drawing significant attention from implementers and service providers because it promises to deliver high-bandwidth data rates to dispersed locations with relatively small changes to the existing telco infrastructure. xDSL services provide dedicated, point-to-point, public network access over twisted-pair copper wire, either on the local loop ("last mile") between a network service provider's (NSP) central office and the customer site, or on local loops created intra-building or intra-campus. Currently, the primary focus in xDSL is the development and deployment of ADSL and VDSL technologies and architectures. This chapter covers the characteristics and operations of ADSL and VDSL.

Asymmetric Digital Subscriber Line (ADSL)

ADSL technology is asymmetric. It allows more bandwidth downstream—from an NSP's central office to the customer site—than upstream from the subscriber to the central office. This asymmetry, combined with always-on access (which eliminates call setup), makes ADSL ideal for Internet/intranet surfing, video-on-demand, and remote LAN access. Users of these applications typically download much more information than they send. ADSL can transmit more than 6 Mbps downstream to a subscriber, plus as much as 640 kbps in both directions (shown in Figure 15-1). Such rates expand existing access capacity by a factor of 50 or more without new cabling. ADSL can literally transform the existing public information network from one limited to voice, text, and low-resolution graphics into a powerful, ubiquitous system capable of bringing multimedia, including full-motion video, to every home this century.
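To see what the asymmetry means in practice, here is a quick back-of-the-envelope calculation in Python, using the rates above and ignoring protocol overhead:

```python
def transfer_seconds(size_bytes, rate_bps):
    """Time to move size_bytes over a link running at rate_bps."""
    return size_bytes * 8 / rate_bps

# Moving a 10 MB file at ADSL's 6 Mbps downstream vs. its 640 kbps upstream:
down = transfer_seconds(10_000_000, 6_000_000)  # about 13 seconds down
up = transfer_seconds(10_000_000, 640_000)      # 125 seconds up
```

The same file takes roughly nine times longer to send than to receive, which is exactly the profile of a user browsing the Web or pulling video.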


ADSL will play a crucial role over the next decade or more as telephone companies enter new markets for delivering information in video and multimedia formats. New broadband cabling will take decades to reach all prospective subscribers. Success of these new services will depend on reaching as many subscribers as possible during the first few years. By bringing movies, television, video catalogs, remote CD-ROMs, corporate LANs, and the Internet into homes and small businesses, ADSL will make these markets viable and profitable for telephone companies and application suppliers alike.

ADSL Capabilities

An ADSL circuit connects an ADSL modem on each end of a twisted-pair telephone line, creating three information channels—a high-speed downstream channel, a medium-speed duplex channel, and a basic telephone service channel. The basic telephone service channel is split off from the digital modem by filters, thus guaranteeing uninterrupted basic telephone service, even if ADSL fails. The high-speed channel ranges from 1.5 to 6.1 Mbps, and duplex rates range from 16 to 640 kbps. Each channel can be submultiplexed to form multiple lower-rate channels.

ADSL modems provide data rates consistent with the North American T1 (1.544 Mbps) and European E1 (2.048 Mbps) digital hierarchies (see Figure 15-2) and can be purchased with various speed ranges and capabilities. The minimum configuration provides 1.5 or 2.0 Mbps downstream and a 16 kbps duplex channel; others provide rates of 6.1 Mbps and 64 kbps duplex. Products with downstream rates up to 8 Mbps and duplex rates up to 640 kbps are available today. ADSL modems accommodate Asynchronous Transfer Mode (ATM) transport with variable rates and compensation for ATM overhead, as well as IP protocols. Downstream data rates depend on a number of factors, including the length of the copper line, its wire gauge, the presence of bridged taps, and cross-coupled interference. Line attenuation increases with line length and frequency, and decreases as wire diameter increases; the reach figures quoted for ADSL generally ignore bridged taps.
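As a rough illustration of how loop length and wire gauge bound the achievable rate, here is a small Python sketch; the reach figures are commonly quoted planning values (ignoring bridged taps), not taken from this chapter:

```python
# Rule-of-thumb ADSL reach in feet, keyed by (downstream rate bps, wire gauge AWG),
# ignoring bridged taps; these are commonly quoted planning values.
REACH_FEET = {
    (2_000_000, 24): 18_000,
    (2_000_000, 26): 15_000,
    (6_100_000, 24): 12_000,
    (6_100_000, 26): 9_000,
}

def best_rate(loop_feet, gauge):
    """Highest table rate whose reach covers this loop, or 0 if none."""
    rates = [rate for (rate, g), reach in REACH_FEET.items()
             if g == gauge and loop_feet <= reach]
    return max(rates, default=0)
```

A 10,000-foot loop on 24-gauge wire can plan for the 6.1 Mbps rate, while the same service on a 14,000-foot 26-gauge loop drops to the 2 Mbps tier.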





Although the measure varies from telco to telco, these capabilities can cover up to 95% of a loop plant, depending on the desired data rate. Customers beyond these distances can be reached with fiber-based digital loop carrier (DLC) systems. As these DLC systems become commercially available, telephone companies can offer virtually ubiquitous access in a relatively short time.

Many applications envisioned for ADSL involve digital compressed video. As a real-time signal, digital video cannot use the link- or network-level error control procedures commonly found in data communications systems. ADSL modems therefore incorporate forward error correction that dramatically reduces errors caused by impulse noise. Error correction on a symbol-by-symbol basis also reduces errors caused by continuous noise coupled into a line.

ADSL Technology

ADSL depends on advanced digital signal processing and creative algorithms to squeeze so much information through twisted-pair telephone lines. In addition, many advances have been required in transformers, analog filters, and analog/digital (A/D) converters. Long telephone lines may attenuate signals at 1 MHz (the outer edge of the band used by ADSL) by as much as 90 dB, forcing analog sections of ADSL modems to work very hard to realize large dynamic ranges, separate channels, and maintain low noise figures. On the outside, ADSL looks simple—transparent synchronous data pipes at various data rates over ordinary telephone lines. The inside, where all the transistors work, is a miracle of modern technology. Figure 15-3 displays the ADSL transceiver-network end.





To create multiple channels, ADSL modems divide the available bandwidth of a telephone line in one of two ways—frequency-division multiplexing (FDM) or echo cancellation—as shown in Figure 15-4. FDM assigns one band for upstream data and another band for downstream data. The downstream path is then divided by time-division multiplexing into one or more high-speed channels and one or more low-speed channels. The upstream path is also multiplexed into corresponding low-speed channels. Echo cancellation assigns the upstream band to overlap the downstream, and separates the two by means of local echo cancellation, a technique well known in V.32 and V.34 modems. With either technique, ADSL splits off a 4 kHz region for basic telephone service at the DC end of the band.






An ADSL modem organizes the aggregate data stream created by multiplexing downstream channels, duplex channels, and maintenance channels together into blocks, and attaches an error correction code to each block. The receiver then corrects errors that occur during transmission up to the limits implied by the code and the block length. The unit may, at the user’s option, also create superblocks by interleaving data within subblocks; this allows the receiver to correct any combination of errors within a specific span of bits. This in turn allows for effective transmission of both data and video signals.
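The interleaving idea can be sketched in a few lines of Python. This shows only the principle, not the actual T1.413 superblock format: symbols from several code blocks are sent column by column, so a burst of line errors is spread across blocks, each of which the per-block code can then correct on its own.

```python
def interleave(symbols, width):
    """Write code blocks of `width` symbols row by row, then read column
    by column, so consecutive line symbols come from different blocks."""
    assert len(symbols) % width == 0
    rows = [symbols[i:i + width] for i in range(0, len(symbols), width)]
    return [rows[r][c] for c in range(width) for r in range(len(rows))]

def deinterleave(symbols, width):
    """Invert interleave(): regroup the column-major stream into blocks."""
    height = len(symbols) // width
    cols = [symbols[i:i + height] for i in range(0, len(symbols), height)]
    return [cols[c][r] for r in range(height) for c in range(width)]
```

With two 4-symbol blocks, for example, a 2-symbol burst in the interleaved stream corrupts at most one symbol in each original block.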

ADSL Standards and Associations
The American National Standards Institute (ANSI) Working Group T1E1.4 recently approved an ADSL standard at rates up to 6.1 Mbps (ANSI Standard T1.413). The European Telecommunications Standards Institute (ETSI) contributed an annex to T1.413 to reflect European requirements. T1.413 currently embodies a single terminal interface at the premises end. Issue II, now under study by T1E1.4, will expand the standard to include a multiplexed interface at the premises end, protocols for configuration and network management, and other improvements.

The ATM Forum and the Digital Audio-Visual Council (DAVIC) have both recognized ADSL as a physical-layer transmission protocol for UTP media. The ADSL Forum was formed in December 1994 to promote the ADSL concept and facilitate development of ADSL system architectures, protocols, and interfaces for major ADSL applications. The forum has more than 200 members, representing service providers, equipment manufacturers, and semiconductor companies throughout the world. At present, the Forum’s formal technical work is divided into the following six areas, each of which is dealt with in a separate working group within the technical committee:

• ATM over ADSL (including transport and end-to-end architecture aspects)
• Packet over ADSL (this working group recently completed its work)
• CPE/CO (customer premises equipment/central office) configurations and interfaces
• Operations
• Network management
• Testing and interoperability

ADSL Market Status

ADSL modems have been tested successfully in more than 30 telephone companies, and thousands of lines have been installed in various technology trials in North America and Europe. Several telephone companies plan market trials using ADSL, principally for data access, but also including video applications for uses such as personal shopping, interactive games, and educational programming.

Semiconductor companies have introduced transceiver chipsets that are already being used in market trials. These chipsets combine off-the-shelf components, programmable digital signal processors, and custom ASICs (application-specific integrated circuits). Continued investment by these semiconductor companies has increased functionality and reduced chip count, power consumption, and cost, enabling mass deployment of ADSL-based services.

Very-High-Data-Rate Digital Subscriber Line (VDSL)

It is becoming increasingly clear that telephone companies around the world are making decisions to include existing twisted-pair loops in their next-generation broadband access networks. Hybrid fiber coax (HFC), a shared-access medium well suited to analog and digital broadcast, comes up somewhat short when used to carry voice telephony, interactive video, and high-speed data communications at the same time. Fiber all the way to the home (FTTH) is still prohibitively expensive in a marketplace soon to be driven by competition rather than cost. An attractive alternative, soon to be commercially practical, is a combination of fiber cables feeding neighborhood optical network units (ONUs) and last-leg-premises connections by existing or new copper. This topology, which is often called fiber to the neighborhood (FTTN), encompasses fiber to the curb (FTTC) with short drops and fiber to the basement (FTTB), serving tall buildings with vertical drops.

One of the enabling technologies for FTTN is VDSL. In simple terms, VDSL transmits high-speed data over short reaches of twisted-pair copper telephone lines, with a range of speeds depending on actual line length. The maximum downstream rate under consideration is between 51 and 55 Mbps over lines up to 1000 feet (300 m) in length. Downstream speeds as low as 13 Mbps over lengths beyond 4000 feet (1500 m) are also common. Upstream rates in early models will be asymmetric, just like ADSL, at speeds from 1.6 to 2.3 Mbps. Both data channels will be separated in frequency from bands used for basic telephone service and Integrated Services Digital Network (ISDN), enabling service providers to overlay VDSL on existing services. At present the two high-speed channels are also separated in frequency. As needs arise for higher-speed upstream channels or symmetric rates, VDSL systems may need to use echo cancellation.





VDSL Projected Capabilities

Although VDSL has not achieved ADSL’s degree of definition, it has advanced far enough that we can discuss realizable goals, beginning with data rate and range. Downstream rates derive from submultiples of the SONET (Synchronous Optical Network) and SDH (Synchronous Digital Hierarchy) canonical speed of 155.52 Mbps, namely 51.84 Mbps, 25.92 Mbps, and 12.96 Mbps. Each rate has a corresponding target range:




• 12.96 Mbps: up to 4,500 feet (1,500 m)
• 25.92 Mbps: up to 3,000 feet (1,000 m)
• 51.84 Mbps: up to 1,000 feet (300 m)

Upstream rates under discussion fall into three general ranges:
• 1.6–2.3 Mbps
• 19.2 Mbps
• Equal to downstream

Early versions of VDSL will almost certainly incorporate the slower asymmetric rate. Higher upstream and symmetric configurations may only be possible for very short lines. Like ADSL, VDSL must transmit compressed video, a real-time signal unsuited to error retransmission schemes used in data communications. To achieve error rates compatible with those of compressed video, VDSL will have to incorporate forward error correction (FEC) with sufficient interleaving to correct all errors created by impulsive noise events of some specified duration. Interleaving introduces delay, on the order of 40 times the maximum length correctable impulse.
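That rule of thumb is easy to quantify. A small Python sketch; the 40x factor is the chapter's order-of-magnitude figure, not an exact specification:

```python
def interleaving_delay_ms(max_impulse_us, factor=40):
    """Added latency from interleaving, using the rule of thumb that delay
    runs roughly `factor` times the longest correctable impulse."""
    return max_impulse_us * factor / 1000.0

# Correcting impulse noise up to 500 microseconds long costs about 20 ms.
delay = interleaving_delay_ms(500)
```

Doubling the impulse length you want to survive doubles the latency budget, which matters for interactive services.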

Data in the downstream direction will be broadcast to every CPE on the premises or be transmitted to a logically separated hub that distributes data to addressed CPE based on cell or time-division multiplexing (TDM) within the data stream itself. Upstream multiplexing is more difficult. Systems using a passive network termination (NT) must insert data onto a shared medium, either by a form of TDM access (TDMA) or a form of frequency-division multiplexing (FDM). TDMA may use a species of token control called cell grants passed in the downstream direction from the ONU modem, or contention, or both (contention for unrecognized devices, cell grants for recognized devices). FDM gives each CPE its own channel, obviating a Media Access Control (MAC) protocol, but either limits the data rates available to any one CPE or requires dynamic allocation of bandwidth and inverse multiplexing at each CPE. Systems using active NTs transfer the upstream collection problem to a logically separated hub that would use (typically) Ethernet or ATM protocols for upstream multiplexing.

Migration and inventory considerations dictate VDSL units that can operate at various (preferably all) speeds, with automatic recognition of a newly connected device or of a change in speed. Passive network interfaces need to support hot insertion, in which a new VDSL premises unit can be put on the line without interfering with the operation of other modems.

VDSL Technology

VDSL technology resembles ADSL to a large degree, although ADSL must face much larger dynamic ranges and is considerably more complex as a result. VDSL must be lower in cost and lower in power, and premises VDSL units may have to implement a physical-layer MAC for multiplexing upstream data.

Line Code Candidates

Four line codes have been proposed for VDSL:

• CAP (carrierless amplitude modulation/phase modulation)—A version of suppressed-carrier quadrature amplitude modulation (QAM). For passive NT configurations, CAP would use quadrature phase-shift keying (QPSK) upstream and a type of TDMA for multiplexing (although CAP does not preclude an FDM approach to upstream multiplexing).

• DMT (discrete multitone)—A multicarrier system using discrete Fourier transforms to create and demodulate individual carriers. For passive NT configurations, DMT would use FDM for upstream multiplexing (although DMT does not preclude a TDMA multiplexing strategy).

• DWMT (discrete wavelet multitone)—A multicarrier system using wavelet transforms to create and demodulate individual carriers. DWMT also uses FDM for upstream multiplexing but allows TDMA as well.

• SLC (simple line code)—A version of four-level baseband signaling that filters the baseband and restores it at the receiver. For passive NT configurations, SLC would most likely use TDMA for upstream multiplexing, although FDM is possible.

Channel Separation

Early versions of VDSL will use frequency-division multiplexing to separate the downstream channel from the upstream channel, and both of them from basic telephone service and ISDN (shown in Figure 15-6). Echo cancellation may be required for later-generation systems featuring symmetric data rates. A rather substantial distance, in frequency, will be maintained between the lowest data channel and basic telephone service to enable very simple and cost-effective basic telephone service splitters. Normal practice would locate the downstream channel above the upstream channel. However, the DAVIC specification reverses this order to enable premises distribution of VDSL signals over coaxial cable systems.


Forward Error Control

FEC will no doubt use a form of Reed-Solomon coding and optional interleaving to correct bursts of errors caused by impulse noise. The structure will be very similar to ADSL, as defined in T1.413. An outstanding question is whether FEC overhead (in the range of 8%) will be taken from the payload capacity or added as an out-of-band signal. The former reduces payload capacity but maintains nominal reach, whereas the latter retains the nominal payload but suffers a small reduction in reach. ADSL puts FEC overhead out of band.
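The tradeoff is simple arithmetic. A Python sketch, using a 51.84 Mbps VDSL rate purely as an illustrative example:

```python
LINE_RATE = 51_840_000  # bps, example VDSL downstream rate
FEC_OVERHEAD = 0.08     # roughly 8% coding overhead

# In band: overhead taken from the payload; line rate (and reach) unchanged.
payload_in_band = LINE_RATE * (1 - FEC_OVERHEAD)

# Out of band: payload kept; the line must carry ~8% more, reducing reach.
line_rate_out_of_band = LINE_RATE * (1 + FEC_OVERHEAD)
```

In-band FEC trims the 51.84 Mbps channel to about 47.7 Mbps of payload; out-of-band FEC keeps the full payload but pushes the wire rate to about 56 Mbps.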

Upstream Multiplexing

If the premises VDSL unit comprises the network termination (an active NT), then the means of multiplexing upstream cells or data channels from more than one CPE into a single upstream becomes the responsibility of the premises network. The VDSL unit simply presents raw data streams in both directions. As illustrated in Figure 15-7, one type of premises network involves a star connecting each CPE to a switching or multiplexing hub; such a hub could be integral to the premises VDSL unit.


In a passive NT configuration, each CPE has an associated VDSL unit. (A passive NT does not conceptually preclude multiple CPE per VDSL, but then the question of active versus passive NT becomes a matter of ownership, not a matter of wiring topology and multiplexing strategies.) Now the upstream channels for each CPE must share a common wire. Although a collision-detection system could be used, the desire for guaranteed bandwidth indicates one of two solutions. The first invokes a cell-grant protocol in which downstream frames generated at the ONU or farther up the network contain a few bits that grant access to specific CPE during a specified period subsequent to receiving a frame. A granted CPE can send one upstream cell during this period. The transmitter in the CPE must turn on, send a preamble to condition the ONU receiver, send the cell, and then turn itself off. The protocol must insert enough silence to let line ringing clear. One construction of this protocol uses 77 octet intervals to transmit a single 53-octet cell.
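The cost of that construction is easy to compute in Python:

```python
CELL_OCTETS = 53   # one ATM cell
SLOT_OCTETS = 77   # grant slot: turn-on, preamble, cell, turn-off, settling

# Fraction of each grant slot that actually carries payload.
efficiency = CELL_OCTETS / SLOT_OCTETS
```

At 53 payload octets per 77-octet slot, only about 69% of the upstream capacity carries cells; the remaining 31% or so is consumed by per-cell overhead.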



The second method divides the upstream channel into frequency bands and assigns one band to each CPE. This method has the advantage of avoiding any MAC with its associated overhead (although a multiplexer must be built into the ONU), but it either restricts the data rate available to any one CPE or imposes a dynamic inverse-multiplexing scheme that lets one CPE send more than its share for a period. The latter would look a great deal like a MAC protocol, but without the loss of bandwidth associated with carrier detect and clear for each cell.

VDSL Issues

VDSL is still in the definition stage; some preliminary products exist, but not enough is known yet about telephone line characteristics, radio-frequency interference emissions and susceptibility, upstream multiplexing protocols, and information requirements to frame a set of definitive, standardizable properties. One large unknown is the maximum distance that VDSL can reliably realize for a given data rate. This is unknown because real line characteristics at the frequencies required for VDSL are speculative, and items such as short bridged taps or unterminated extension lines in homes, which have no effect on telephony, ISDN, or ADSL, may have very detrimental effects on VDSL in certain configurations. Furthermore, VDSL invades the frequency ranges of amateur radio, and every above-ground telephone wire is an antenna that both radiates and attracts energy in amateur radio bands. Balancing the low signal levels needed to prevent emissions that interfere with amateur radio against the higher signal levels needed to combat interference from amateur radio could be the dominant factor in determining line reach.

A second dimension of VDSL that is far from clear is the services environment. It can be assumed that VDSL will carry information in ATM cell format for video and asymmetric data communications, although optimum downstream and upstream data rates have not been ascertained. What is more difficult to assess is the need for VDSL to carry information in non-ATM formats (such as conventional Plesiochronous Digital Hierarchy [PDH] structures) and the need for symmetric channels at broadband rates (above T1/E1). VDSL will not be completely independent of upper-layer protocols, particularly in the upstream direction, where multiplexing data from more than one CPE may require knowledge of link-layer formats (that is, ATM or not).

A third difficult subject is premises distribution and the interface between the telephone network and CPE. Cost considerations favor a passive network interface with premises VDSL installed in CPE and upstream multiplexing handled similarly to LAN buses. System management, reliability, regulatory constraints, and migration favor an active network termination, just like ADSL and ISDN, that can operate like a hub, with point-to-point or shared-media distribution to multiple CPE on-premises wiring that is independent and physically isolated from network wiring.

However, costs cannot be ignored. Small ONUs must spread common equipment costs, such as fiber links, interfaces, and equipment cabinets, over a small number of subscribers compared to HFC. VDSL therefore has a much lower cost target than ADSL, which may connect directly from a wiring center, or than cable modems, which also have much lower common equipment costs per user. Furthermore, VDSL for passive NTs may (only may) be more expensive than VDSL for active NTs, but the elimination of any other premises network electronics may make it the most cost-effective solution, and highly desired, despite the obvious benefits of an active NT. Stay tuned.

Standards Status

At present five standards organizations/forums have begun work on VDSL:

• T1E1.4—The U.S. ANSI standards group T1E1.4 has just begun a project for VDSL, making a first attack on system requirements that will evolve into a system and protocol definition.

• ETSI—ETSI has a VDSL standards project, under the title High-Speed Metallic Access Systems, and has compiled a list of objectives, problems, and requirements. Among its preliminary findings are the need for an active NT and payloads in multiples of the SDH virtual container VC-12, or 2.3 Mbps. ETSI works very closely with T1E1.4 and the ADSL Forum, with significant overlap in attendees.

• DAVIC—DAVIC has taken the earliest position on VDSL. Its first specification due to be finalized will define a line code for downstream data, another for upstream data, and a MAC for upstream multiplexing based on TDMA over shared wiring. DAVIC is only specifying VDSL for a single downstream rate of 51.84 Mbps and a single upstream rate of 1.6 Mbps over 300 m or less of copper. The proposal assumes, and is driven to a large extent by, a passive NT, and further assumes premises distribution from the NT over new coaxial cable or new copper wiring.

• The ATM Forum—The ATM Forum has defined a 51.84 Mbps interface for private network UNIs and a corresponding transmission technology. It has also taken up the question of CPE distribution and delivery of ATM all the way to the premises over the various access technologies described above.

• The ADSL Forum—The ADSL Forum has just begun consideration of VDSL. In keeping with its charter, the forum will address network, protocol, and architectural aspects of VDSL for all prospective applications, leaving line code and transceiver protocols to T1E1.4 and ETSI and higher-layer protocols to organizations such as the ATM Forum and DAVIC.

VDSL’s Relationship with ADSL

VDSL has an odd technical resemblance to ADSL. VDSL achieves data rates nearly 10 times greater than those of ADSL (shown in Figure 15-8), but ADSL is the more complex transmission technology, in large part because ADSL must contend with much larger dynamic ranges than VDSL. However, the two are essentially cut from the same cloth. ADSL employs advanced transmission techniques and forward error correction to realize data rates from 1.5 to 9 Mbps over twisted pair, ranging to 18,000 feet; VDSL employs the same advanced transmission techniques and forward error correction to realize data rates from 13 to 55 Mbps over twisted pair, ranging to 4,500 feet. Indeed, the two can be considered a continuum, a set of transmission tools that delivers about as much data as theoretically possible over varying distances of existing telephone wiring.

Digital Subscriber Line (DSL) technology

VDSL is clearly a technology suitable for a full-service network (assuming that full service does not imply more than two high-definition television [HDTV] channels over the highest-rate VDSL). It is equally clear that telephone companies cannot deploy ONUs overnight, even if all the technology were available. ADSL may not be a full-service network technology, but it has the singular advantage of offering service over lines that exist today, and ADSL products are closer in time than VDSL. Many new services being contemplated today—such as videoconferencing, Internet access, video on demand, and remote LAN access—can be delivered at speeds at or below T1/E1 rates. For such services, ADSL/VDSL provides an ideal combination for network evolution. On the longest lines, ADSL delivers a single channel. As line length shrinks, either from natural proximity to a central office or deployment of fiber-based access nodes, ADSL and VDSL simply offer more channels and capacity for services that require rates above T1/E1 (such as digital live television and virtual CD-ROM access).


 
© Copyright CANBOGA 2012 - Some rights reserved | Powered by Blogger.com.