Untitled Document

NETWORK MEASUREMENT

Bandwidth/throughput Measurement:
- CAIDA has very good third-party bandwidth/throughput measurement tools comparison here.
- Stanford Linear Accelerator Center has a good list of network monitoring tools here.
- Effective Bandwidth
- Network Characterization Service (NCS)
- PATHRATE & PATHLOAD
- PathChirp
- NetTimer
Summary of the methods to measure the available bandwidth ......

NETWORK TOPOLOGY DISCOVERY

For the layer 3 topology, Traceroute is a good tool ( if the machine responses well for the ICMP message). But if you want to find layer 2 topology, traceroute will not work, since layer 2 is transparent to it.

If you want to find the layer 2 topology you have to use the SNMP, and in the following paper some ideas were proposed to solve the problem. They all need to get info from MIB.

Also how to find the hidden layer 2 switch between the layer 3 device is a good problem, based on some BW measurement tools, the difference result from 2 different BW measure methods will provide some useful info about the existence of layer 2 devices.

References:
1. Topology Discovery in Heterogeneous IP Networks, Yuri Breitbart, Minos Garofalakis, Cliff Martin, Rajeev Rastogi, S. Seshadri, Abraham Silberschatz. Proceedings of IEEE INFOCOM 2000, Tel Aviv, Israel, March 2000.
2. Physical Topology Discovery for Large Multi-Subnet Networks,Yigal Bejerano, Yuri Breitbart, Minos Garofalakis, Proceedings of IEEE INFOCOM '2003.
3. Topology Discovery for Large Ethernet Networks, B. Lowekamp, D. R. O’Hallaron, and T. R. Gross,Proceedings of ACM SIGCOMM,San Diego, California, Aug. 2001.
4. Octopus project: Network Topology Discovery at Cornell,Representative paper: Discovering neternet Topology, R. Siamwalla, R. Sharma and K. Keshav, Proceedings of IEEE INFOCOM '99.
5. Heuristics for Internet Map Discovery, Ramesh Govindan, Hongsuda Tangmunarunkit,Proceedings of IEEE INFOCOM 2000, Tel Aviv, Israel, March 2000.
6. A brief explanation for finding the hidden switch

But if you want to identify the internal logical topology, only from the end-to-end measurement, you may need network tomography. This is a new field which benefit greatly from the wealth of statistical theory and algorithms. And the topology identification is one promising application.

References:

Network Tomography: recent developments

Internet Tomography

IEEE INFOCOM 2003 "Traffic Monitoring and Topology Inference"

Maximum Likelihood Network Topology Identification from Edge-based Unicast Measurements.

Maximum Likelihood Network Topology Identification from End-to-end Unicast Measurements. (By using the sandwidth probes, we can get sample about the queue delay along each link. Then we build the maixmum likelihood function, and use it to infer the tree topology which will most likely fit the real topology. And a novel Markov Chain Monte Carlo procedure was developed for the rapid determination of the most likely topologies.)

Network Tomography and the identification of share infrastructure.

Discovering General Logical Network Topologies ( Based on the result from 5 )

CLUSTER BY NETWORK DISTANCE

Given N end hosts, which belong to different administrative domains, how could we divide them into different clusers or groups by the network distance between them without knowing the info about the underlying topology ?

There are several projects related to this problem:

Global Network Position (GNP)
- The key idea is to represent the complex structure of the Internet by a simple geometric space (e.g. an N-dimensional Euclidean space). In this representation, each host in the Internet is characterized by its position in the geometric space with a set of geometric coordinates. If the representation is accurate, then the easily computable geometric distances between hosts in this geometric space can accurately approximate the Internet network distances, and no actual network measurements are required.
- If the number of landmark is D, then the vector will be D-dimension. And the distances of all paths between N end hosts can efficiently communicated by K sets of coordinates of D numbers each (i.e. O(KD) of data), as opposed to K(K-1)/2 individual distances (i.e., O(K^2) of data).
- As to the network distance, the metric is round-trip propagation and transmission delay, a relatively stable characteristic between hosts.

Internet Distance Map Service (IDMaps)

The objective of this research is to explore scalable design alternatives for an architecture to provide Internet Distance Maps Service (IDMaps). In IDMaps, special HOPS servers maintain a virtual topology map of the Internet consisting of end hosts and special hosts called Tracers. This virtual topology map is used to predict Internet distance. For example, the distance between hosts A and B is estimated as the distance between A and its nearest Tracer T1, plus the distance between B and its nearest Tracer T2, plus the shortest path distance from T1 to T2 over the Tracer virtual topology. As the number of Tracers grow, the prediction accuracy of IDMaps tends to improve.
Where to put place the Tracer is an interesting problem.IDMaps address this in two different ways, namely the K-HST and the minimum K-center algorithms. These algorithms have been used to determine placement of fire stations, ambulance placement.

Internet Iso-bar & The Wide-area Network Measurement and Monitoring Services(WNMMS)

Internet Iso-bar integrated the ideas of IDMaps and GNP together. It develops the network distance idea, and introduces the correlation distance concept. Based on this, it use the K-HST and K-center algorithm to implement the clustering.
WNMMS provide good refrences on the related works.

Based on all these related works, I am going to use the algorithm used in ECO (Effective Collective Operation) and the network distance to cluster the hosts. Here, you can find more about ECO.

Select the landmark.
I will use the Domain Name Sever(DNS) of each hosts as the landmarks. But each time a new DNS is added the dimension of the space will increased by one, then we may need to recaculate all the network distance again. We should find an optimal solution, that is mean only add new DNS as the landmark when it is necessary.
Get the coordinate of each hosts.
Suppose that there are n hosts ( H(i), where i=1, 2, ...,n) , and m landmarks ( L(j), wherej=1,2,3, ... , m).

Cor[H(i)] = { Xi1, Xi2, ..., Xij, ... , Xim}, where Xij=Latency[H(i), L(j)]
Caculate the network distance (and the vector similarity) based on the coordinates
D(H(i),H(j)) = sum of (Xik-Xjk)*(Xik-Xjk), where k=1,2, ... , m
Use the following algorithm to cluster the hosts.

Put all the hosts in a graph, where the vertexes are the hosts, and the edge between them is the network distance that we get from the measurement.

initialize clusters to empty

for all hosts

host.min_edge = minimum cost edge incident on node

sort edges by nondecreasing cost

for all edges(a,b)

if a and b are in the same cluster

continue

if (edge.weight > Q1 * node(a).min_edge) or (edge.weight > Q1 * node(a).min_edge)

continue

if node(a) in a cluster

if ( edge.weight > Q2 * node(a).cluster.min_edge )

continue

if node(b) in a cluster

if ( edge.weight > Q2 * node(b).cluster.min_edge )

continue

merge node(a).cluster and node(b).cluster

set new_cluster.min_edge=min( node(a).cluster.min_edge, node(b).cluster.min_edge )