The Internet is a large, complex space with a lot of processes and data structures intertwined all over the place. This rich soup gives rise to a range of unexpected tricks and techniques, some of which are described in the following sections:
The Virtual City Model
You can think of a web site like a city, where the domain name itself is the center square of the city, the folders in the path names are streets, and each web page is a building. Some buildings are closer to the center square than others (shorter URLs). Some are connected to other buildings in the city by side streets, tunnels, and alleyways (i.e., links).
When you jump from one section of a site to another, you jump up into the air and parachute down into a new neighborhood. Some buildings have links to entirely different sites, which is like stepping into a transporter beam that takes you to a related building but in a different town.
You can use this model to walk towards the center square of a city by manually editing any URL, for example by successively deleting the right-most words of the following URL to walk up the side streets towards the center square:
http://www.cnn.com/TECH/computing/comics/index.html
http://www.cnn.com/TECH/computing/comics/
http://www.cnn.com/TECH/computing/
http://www.cnn.com/TECH/
http://www.cnn.com/
A similar process may be exercised on any URL,
although some addresses will jump to a default page or return a blank page because
there is nothing built at that location. There is often more construction closer
to the city center.
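The walk towards the center square can be sketched in a few lines of Python. This is only an illustration of the manual editing described above; the function name walk_up is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def walk_up(url):
    """Return the URL followed by each shorter URL, ending at the site root."""
    parts = urlsplit(url)
    path = parts.path
    results = [url]
    # Repeatedly drop the right-most path segment until only "/" is left.
    while path not in ("", "/"):
        path = path.rstrip("/")
        path = path[: path.rfind("/") + 1]
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results


for step in walk_up("http://www.cnn.com/TECH/computing/comics/index.html"):
    print(step)
```

Each printed line is one step closer to the center square. Whether anything is built at each address still has to be checked by visiting it.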
Watching Other Searches
Some search sites display some of the searches other people are making on their site in real-time. These sites receive thousands of searches a minute, so you only actually see a small, random subset of all of the searches, and there is no way to know who is making the searches that you do see. However, they provide an interesting insight into the current interests of other Internet searchers around the world.
Searching URLs
Some search engines let you specify that a search keyword must be present in the URL. For example, Alta Vista uses the search term "url:keyword", Infoseek also uses "url:keyword", Yahoo uses "u:keyword", and Northern Light provides a separate text field.
This functionality lets you narrow your search to sites with certain words actually in the URL. For example, you can search Alta Vista for sites with the word "garden.com" in the URL with the search url:garden.com.
You can also use this technique to search a site that doesn't provide its own search engine. (For example, the search function on the LivingInternet.com site uses this feature.) You compose your search as normal, and then add a URL keyword to limit the search to a particular site.
The form of the URL search command is different for each search site. It can often be found on the site's advanced search page, is usually described in the site's help section, and is identified in the feature summaries provided in the Search Sites section. A few examples are shown below:
url:www.thetrip.com and florida -- Alta Vista search of the site TheTrip.com for pages including the word "florida".
url:yahoo.com and garden -- Alta Vista search
of Yahoo for the word "garden".
site:livinginternet.com "words actually in the URL" -- Google search of LivingInternet.com for the phrase "words actually in the URL".
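If you build such searches often, the query strings can be composed by a small helper. The sketch below only covers the two operator forms shown in the examples above; the names TEMPLATES and site_query are invented for this illustration, and other search sites use different syntax.

```python
# Operator templates copied from the examples in the text.
# Other search sites use different forms, so treat these as assumptions.
TEMPLATES = {
    "altavista": "url:{site} and {terms}",
    "google": "site:{site} {terms}",
}


def site_query(engine, site, terms, exact_phrase=False):
    """Compose a search string limited to one site."""
    if exact_phrase:
        terms = '"' + terms + '"'
    return TEMPLATES[engine].format(site=site, terms=terms)


print(site_query("altavista", "www.thetrip.com", "florida"))
print(site_query("google", "livinginternet.com",
                 "words actually in the URL", exact_phrase=True))
```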
Monitoring Web Changes
You can monitor when your favorite web pages are updated.
You can sometimes determine when a web page file was last modified with the about feature by entering "about:URL" in the location field, as in the following:
about:http://www.ncsa.uiuc.edu/SDG/Experimental/demoweb/old/marc-info.html
This feature can sometimes be useful for comparing
the dates of pages, to check how old a page is, or for checking the time-stamp
on news pages of various news services when checking late breaking news stories.
However, not all web servers support this feature, so it only works on some
pages.
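The about feature depends on the browser. A related approach, not described above and offered here only as a sketch, is to ask the web server directly for the Last-Modified header of a page. The function names are made up for this example, and, as with the about feature, not every server sends this header.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def parse_last_modified(header_value):
    """Convert an HTTP date string into a datetime, or None if it is absent."""
    if not header_value:
        return None
    return parsedate_to_datetime(header_value)


def last_modified(url, timeout=10):
    """Send a HEAD request and return the page's Last-Modified time, if any."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return parse_last_modified(response.headers.get("Last-Modified"))


# Parsing a header value of the usual form (no network access needed):
print(parse_last_modified("Wed, 21 Oct 2015 07:28:00 GMT"))
```

Calling last_modified on two pages and comparing the results gives the same kind of date comparison described above.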
Translating Web Pages
You can automatically translate online content. A range of sites provide language translation. The translations will not be perfect, but they often produce a passable version that can be generally understood. Some of the sites also offer an option to paste source text directly into a translation box.
The following web sites provide automatic language translation services.
Alta Vista - Has a language translation site
that uses software from Systran to translate any web page from and to several
languages.
Internet Translator - The InterTran page provides language translation between
729 different language pairs.
Multi-Language Translator - Translates between more than 70 language pairs.
T-Mail - Translation by email.
Linking TV And The Web
You can create your own dynamic TV to Internet links by searching for words, names, and phrases you hear on television shows while you are watching them. This technique is often useful when watching a story about something you have not heard of before.
For example, suppose you are watching an interview with a famous artist named Mr. Kumar. You can search for "Mr. Kumar" to get more information to fill you in on his background as you watch the interview.
Guessing Related Site Names
If you find a site name has a fourth-level or higher domain name, you can try removing the leftmost parts to see if there are related sites. For example, you can change
www.main.twenty.net ---> www.twenty.net
If a site name includes a number, you can try related numbers. For example, you can change
www3.twenty.net ---> www2.twenty.net
Similarly, you can change:
www.twenty.net/news/story1.html --->
www.twenty.net/news/story2.html
You can also change site suffixes to explore related sites. Sometimes there
are related sites, sometimes they are owned by the same company, and sometimes
there isn't anything there. For example, you can try each of the following sites:
www.time.com
www.time.org
www.time.net
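The guesses above (shortening a long name, trying nearby numbers, and swapping suffixes) can be generated mechanically. The following is a rough sketch; the function name related_hosts and the default suffix list are assumptions made for illustration, and none of the generated names is guaranteed to exist.

```python
import re


def related_hosts(host, suffixes=("com", "org", "net")):
    """Return guesses for site names related to the given host name."""
    host = host.lower()
    labels = host.split(".")
    candidates = []

    # Fourth-level or deeper name: drop the part right after the prefix,
    # as in www.main.twenty.net -> www.twenty.net.
    if len(labels) >= 4:
        candidates.append(".".join([labels[0]] + labels[2:]))

    # Numbered prefix such as www3: try the neighboring numbers.
    match = re.fullmatch(r"([a-z]+)(\d+)", labels[0])
    if match:
        name, number = match.group(1), int(match.group(2))
        for n in (number - 1, number + 1):
            if n > 0:
                candidates.append(".".join([f"{name}{n}"] + labels[1:]))

    # Swap the suffix for each of the common alternatives.
    for suffix in suffixes:
        candidates.append(".".join(labels[:-1] + [suffix]))

    # Drop duplicates and the original name, keeping the order.
    seen = {host}
    unique = []
    for candidate in candidates:
        if candidate not in seen:
            seen.add(candidate)
            unique.append(candidate)
    return unique


print(related_hosts("www.main.twenty.net"))
print(related_hosts("www3.twenty.net"))
print(related_hosts("www.time.com"))
```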
You can also try different prefixes, such as
"ftp" and "support" in place of "www".
You can use the web as a spell checker, and search for misspelled words for entertainment:
Spelling. You can use the web as a kind of huge spell checker if you don't have anything else handy. For example, let's say you aren't sure of the spelling of "committee". You can search on the most likely options, commitee, committee, and comitee, and the search that returns the most hits is usually the correct one. This check is generally reliable, since more people spell words correctly than incorrectly, and search engines have such large databases that they provide a reasonable statistical sample.
Misspelling. For good reasons and bad, many people have created sites with domain
names that are misspellings of well known site names, and they count on people
dropping into their site by mistake when they misspell the legitimate site's
name. Commercial organizations usually bring lawsuits to close these copycat
sites down.
In a similar vein, you can search the web for
misspellings of words for pure entertainment. It is interesting to find the
number of sites that have misspelled pretty much any word you can think of,
in almost any way you can think of. For example, it is amazing how many pages
there are containing the words zology, orgnization, and gardning. Even more
challenging is searching for multiple misspellings that still return results,
like commitee and orgnization.
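The spelling check described earlier in this section amounts to picking the candidate with the most hits. A minimal sketch follows; the hit counts are invented purely for illustration, and in practice you would note the number of results reported by a search site for each spelling.

```python
def most_common_spelling(hit_counts):
    """Return the spelling with the largest number of search hits."""
    return max(hit_counts, key=hit_counts.get)


# Hypothetical hit counts, not real search results.
hits = {"commitee": 120, "committee": 9500, "comitee": 45}
print(most_common_spelling(hits))
```

The same dictionary also shows how common each misspelling is, which is the entertainment half of the technique.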
Site domain names are made in the form "www.domain.suffix", where common suffixes include ".com", ".org", ".net". There are several hundred words used in everyday conversation. You can parachute into web sites all over the world, without the help of any reference or search engine, just by typing in a word you are curious about, together with one of the common suffixes.
For example, you might parachute into the middle of the following cities, just to see what is there.
www.book.org
www.car.net
www.five.com
www.freedom.com
www.movies.com
www.one.com
This will work for most common words, because most common domain names have
already been registered by someone for something. By making up domain names
containing words of particular interest to you, you can retrieve an interesting
cross-section of the organizations that make up the current web.
Compound words and phrases often work as well (like "LivingInternet"), although the more unlikely the combination the less likely it will exist.
You can also parachute into the web sites associated
with the domain names in the addresses of people who send you email, dropping
into the web site of their Internet Service Provider. For example, if you get
an email from someone at twenty.net, you can visit http://www.twenty.net/ just
to see what is there.
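The candidate addresses for parachuting can be produced with a couple of short functions. This is a sketch under simple assumptions: the function names are made up, the suffix list is just the three common suffixes mentioned above, and nothing here checks whether a site actually exists at each address.

```python
def candidate_sites(words, suffixes=("com", "org", "net")):
    """Combine each word with each common suffix to form site addresses."""
    return [
        f"http://www.{word.lower()}.{suffix}/"
        for word in words
        for suffix in suffixes
    ]


def site_from_email(address):
    """Guess the web site that goes with the domain of an email address."""
    domain = address.rsplit("@", 1)[1]
    return f"http://www.{domain.lower()}/"


print(candidate_sites(["book", "car"]))
print(site_from_email("someone@twenty.net"))
```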
Finding Paths And Speeds
You can diagnose and display the network path between your computer and a web site, and compare the speed of one web site to another:
Paths. If you can't reach a web site, you can use the ping utility to ping the site's domain name. If the site returns a response, then the host and the network path to it are working, and the problem is more likely with the web server software itself. If the ping fails or times out, and you can still ping other sites like yahoo.com, then the site's host or its network is likely down, although some sites simply do not answer pings.
You can trace the path that your communications
with a web site take over the net with the traceroute utility, showing the length
of the path, and the specific routers at which each packet stops along the way.
Speed. If you have a choice of a number of mirror web sites, or if you have
a choice of which site to download a large file from, or if you are just curious,
then you can use the ping utility to determine which site has the fastest response
time. Just ping each web site, and then use the one with the minimum response
time.
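The speed comparison can be automated by running ping on each site and keeping the lowest time. The sketch below assumes the ping output contains entries such as "time=11.6 ms", which is the form printed by common Unix and Windows versions; other implementations may differ. The function names are made up for this example.

```python
import platform
import re
import subprocess


def parse_ping_times(output):
    """Extract round-trip times in milliseconds from ping output."""
    return [float(ms) for ms in re.findall(r"time[=<]([\d.]+)\s*ms", output)]


def ping_min(host, count=3):
    """Ping a host and return its lowest response time, or None on failure."""
    flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", flag, str(count), host], capture_output=True, text=True
    )
    times = parse_ping_times(result.stdout)
    return min(times) if times else None


def fastest(hosts, measure=ping_min):
    """Return the host with the minimum response time, ignoring failures."""
    timings = {host: measure(host) for host in hosts}
    reachable = {h: t for h, t in timings.items() if t is not None}
    return min(reachable, key=reachable.get) if reachable else None
```

For example, fastest(["mirror1.example.com", "mirror2.example.com"]) would ping both hypothetical mirrors and return the quicker one. A host that returns None did not answer, which is the failure case discussed under Paths.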
You can often find a site's URL by searching for a unique identifier, or by searching the newsgroups. Sometimes you hear about a site in conversation but don't know the URL, and would like to be able to find it. Often the following can help:
Search Engines. Try the techniques described
in the expert search section to search for the site in the major search engines.
In particular, try searching for some piece of unique information about the
site. If you know a unique name associated with the site, or even a part of
the URL, you can often find it.
Newsgroups. If the site is very new, it may not yet be indexed by the search
sites. However, if a site has any notoriety at all it will probably have been
discussed in the newsgroups, and you can usually find it by searching Usenet.
Limiting the search to a date range of the past few days is often effective
when the site has recently been of interest in the news -- several people somewhere
are almost certain to have posted messages about it.
Web robots search the net to build search engine databases. Martijn Koster designed the file "robots.txt" for use by an Internet site to tell search engines which directories and files to exclude from their search robot scans. For example, a directory might be excluded because it contains very dynamic information, like topical news data, and the pages would either be missing or changed when later retrieved through a search engine database.
You can read the robots.txt file for a site by appending "/robots.txt" to the site name, as shown below:
http://www.livinginternet.com/robots.txt
http://www.excite.com/robots.txt
http://www.cnn.com/robots.txt
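Once you have a robots.txt file, you can check which addresses it excludes with the robot parser in Python's standard library. The sketch below uses a made-up sample file and a made-up robot name (ExampleBot) rather than the contents of any of the sites listed above.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt file used only for illustration.
SAMPLE = """\
User-agent: *
Disallow: /news/
Disallow: /cgi-bin/
"""


def build_parser(robots_text):
    """Create a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(SAMPLE)
for url in ("http://www.example.com/news/today.html",
            "http://www.example.com/index.html"):
    print(url, parser.can_fetch("ExampleBot", url))
```

To check a live site instead, the parser's set_url and read methods can fetch the file directly from an address like those shown above.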
Resources. The following sites provide more information about web robots:
Indexing The World (From 1994.)
Martijn Koster's Web Robots Pages
Search Engine Watch Spider Spotting Chart
Spider Spotting
Web Robots Database
Web Robots Pages
http://www.tardis.ed.ac.uk/~sxw/robots/list.html