Hit Parade

For this programming exercise you'll write code that analyzes the logfiles from a webserver. Each logfile contains multiple lines in the format below, with two strings on each line: an IP address and a URL visited from that address.
   IP_address URL
   IP_address URL
   ...
For example, the first three lines of tinylog.txt are shown below.

32.101.160.49   http://www.cs.duke.edu/~magda/Lines/sounds/ip.au 
32.101.160.49   http://www.cs.duke.edu/~magda/Lines/sounds/ip.au 
172.157.127.206 http://www.cs.duke.edu/education/courses/cps130/fall98/lectures/lect13/node7.html 
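
To make the line format concrete, here is a minimal sketch of splitting one line into its two fields. It assumes the fields are separated by one or more spaces or tabs, as in the sample lines above. The names LogLineSketch and parse are hypothetical and are not part of the course code; WeblogIterator already does this work for you.

```java
class LogLineSketch {
    // Splits one logfile line into {IP_address, URL}.
    // Assumption: the two fields are separated by one or more whitespace characters.
    static String[] parse(String line) {
        // trim() drops trailing blanks; the limit of 2 keeps the URL in one piece.
        return line.trim().split("\\s+", 2);
    }

    public static void main(String[] args) {
        String[] fields = parse("32.101.160.49   http://www.cs.duke.edu/~magda/Lines/sounds/ip.au ");
        System.out.println("IP:  " + fields[0]);
        System.out.println("URL: " + fields[1]);
    }
}
```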

The file biglog.txt contains more than 30,000 lines; tinylog.txt contains 50 lines.

The code below uses the class WeblogIterator to print all the URLs in a logfile.

    public static void main(String[] args)
    {
        // Open the logfile to read.
        WeblogIterator wlogger = new WeblogIterator();
        wlogger.open("biglog.txt");

        // Each call to next() returns {IP_address, URL}; print the URL.
        while (wlogger.hasNext()) {
            String[] line = (String[]) wlogger.next();
            System.out.println(line[1]);
        }
    }

Each element returned by the WeblogIterator.next method is a two-element String array with the IP address at index 0 and the URL at index 1.

Write a program to determine which URL in a logfile has the most unique hits. If the URL is visited by the same IP address several times, this should count as only one hit from that IP address. The URL with the most unique hits is the one visited by the most different IP addresses.
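
One possible way to count unique hits is to keep, for each URL, the set of IP addresses that have visited it; a set ignores repeats, so its size is the number of unique hits. The sketch below is only an illustration under stated assumptions, not the required solution. It takes a list of {IP_address, URL} arrays instead of reading from a WeblogIterator so that it can run on its own, and the class name, method name, and sample entries are all made up for the example. Ties go to the URL that appears first in the input.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class UniqueHitsSketch {
    // Returns the URL visited by the most distinct IP addresses,
    // or null if there are no entries. Each entry is {IP_address, URL}.
    static String mostUniqueHits(List<String[]> entries) {
        // URL -> set of IP addresses seen for that URL (insertion order kept).
        Map<String, Set<String>> visitors = new LinkedHashMap<>();
        for (String[] entry : entries) {
            visitors.computeIfAbsent(entry[1], url -> new HashSet<>()).add(entry[0]);
        }

        String best = null;
        int max = 0;
        for (Map.Entry<String, Set<String>> e : visitors.entrySet()) {
            int unique = e.getValue().size();
            if (unique > max) { // strict comparison: the first URL wins a tie
                max = unique;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up entries: /a has three hits from one IP, /b has two hits from two IPs.
        List<String[]> entries = new ArrayList<>();
        entries.add(new String[] {"10.0.0.1", "http://example.com/a"});
        entries.add(new String[] {"10.0.0.1", "http://example.com/a"});
        entries.add(new String[] {"10.0.0.1", "http://example.com/a"});
        entries.add(new String[] {"10.0.0.1", "http://example.com/b"});
        entries.add(new String[] {"10.0.0.2", "http://example.com/b"});
        System.out.println(mostUniqueHits(entries)); // prints http://example.com/b
    }
}
```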

Definition of Map: A collection of objects that can be indexed.

A Map, also called a Table or Dictionary, is composed of keys and the values associated with them.

Hints:

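To traverse a Map:

	// Needs java.util.Iterator and java.util.Map.
	// Counter is assumed to provide getValue().
	Iterator it = map.keySet().iterator();
	int max = 0;		// largest count seen so far
	String smax = null;	// key that has that count
	while (it.hasNext())
	{
		String s = (String) it.next();
		int count = ((Counter) map.get(s)).getValue();
		if (count > max)
		{
			max = count;
			smax = s;
		}
	}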
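
The traversal above relies on a Counter class that this handout does not show. The sketch below fills that gap with a hypothetical stand-in that has only increment and getValue, so the real class may differ. It also shows one way the map might be filled before it is traversed. MapTraversalSketch, count, and keyWithMaxCount are names invented for the example, and it uses generics in place of the casts in the hint.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Counter class used in the hint.
class Counter {
    private int value = 0;

    void increment() { value++; }

    int getValue() { return value; }
}

class MapTraversalSketch {
    // Adds one to the counter stored under key, creating the counter on first use.
    static void count(Map<String, Counter> map, String key) {
        Counter c = map.get(key);
        if (c == null) {
            c = new Counter();
            map.put(key, c);
        }
        c.increment();
    }

    // Returns the key whose counter holds the largest value,
    // or null if no counter is above zero.
    static String keyWithMaxCount(Map<String, Counter> map) {
        int max = 0;
        String smax = null;
        for (String s : map.keySet()) {
            int count = map.get(s).getValue();
            if (count > max) {
                max = count;
                smax = s;
            }
        }
        return smax;
    }

    public static void main(String[] args) {
        Map<String, Counter> map = new HashMap<>();
        for (String key : new String[] {"a", "b", "b", "c", "b", "a"}) {
            count(map, key);
        }
        System.out.println(keyWithMaxCount(map)); // prints b (counted three times)
    }
}
```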