Efficient Servers

You are given the task of exploring the streets of a foreign city to determine the shortest drivable distances between any two locations in the city. Doing this yourself is time consuming and does not scale well. Fortunately, you have access to the data reported by the vehicles driving through the city.

Each vehicle traveling through the city regularly logs its location (and the distance traveled between the logged locations). The vehicles send you this data at random times, such as when the vehicle reaches the destination, when it does not drive for a long time, before leaving the city, or when its internal buffer overflows.

Your customers are patrolmen scheduling their morning work shift, technicians fixing traffic lights, and, of course, garbage collectors. They will ask you for the length of the shortest path between the places they care about.

The main goal of this task is to implement a server application that can handle many requests from many clients.

The secondary goal is to become familiar with the documentation and source code of the platforms used and to apply the knowledge acquired in the lectures.

Task assignment

Write a TCP + protobuf-based server application that creates a directed graph data structure based on the data received from the clients, and responds to requests for the length of the shortest path between various locations. The server must meet the following requirements:

  • All TCP transmissions consist of two parts: message length (4 bytes, big endian, unsigned int) + serialized protobuf message.

  • The server must be able to handle a large number of simultaneously connected clients (about 100).

  • Your solution must pass the given time limit, i.e. be green in the web application.

  • The server must respond to each client Request message with a Response message. The Status field in the response is required. Presence of other fields depends on the type of Request.

  • The server builds a directed graph data structure from Walk requests from the clients. In the graph, nodes represent locations and edges store the length of the path between the locations. There is at most one edge between any pair of nodes. The length of the edge does not necessarily correspond to Euclidean distance of edge's end locations. The graph is directed, meaning that an edge from location A to B does not imply anything about the edge from B to A. One Walk never contains the same edge twice.

  • The x and y coordinates of the Location are given in millimeters.

  • Due to the inaccuracy of the measurements, the same physical location may be represented by different Locations sent in the requests. Assume that any two locations with Euclidean distance less or equal to 50 cm represent the same physical location.

  • Multiple Walk requests can contain the same edge, but the lengths of that edge may differ by up to 1 m.

  • The server responds to OneToOne requests with the length of the shortest path between the two given locations (in shortest_path_length field). The shortest path length should be calculated as a sum of average edge lengths. The average edge length is calculated from all Walk requests that contained that edge using integer division, i.e. rounding the result towards zero.

  • The server responds to OneToAll requests with the sum of lengths of existing shortest paths between node origin and all other nodes. Each such a length should be calculated as for OneToOne request. The resulting value is sent in the total_length field.

  • When calculating path lengths to respond OneTo* requests, the following rules determine which Walk request to include in the calculation.

    The following walks must be included:

    • R1: All Walk requests received from the same client (in the same connection) before the OneTo* request.
    • R2: All Walk requests received from other clients before the server responded a OneTo* request to that client. That is, the OneTo* response represents a synchronization point between clients.

    The following walks may or may not be included:

    • R3: Other received Walk requests not covered by rules R1 or R2.

    Consider this example with two clients A and B where W represents a Walk request and O an OneTo* request:

    A: W₁W₂O₁W₃W₄O₃
    B: W₅W₆W₇W₈O₂
       → time
    

    Response to O₁ must include data from walks W₁ and W₂ (R1). If you want, you can also include some or all data from W₅ and/or W₆, which were received before O₁ (R3). Response to O₂ must include data from walks W₁, W₂ (R2), and W₅, W₆, W₇ and W₈ (R1). You can include W₃ if you want (R3). Response to O₃ must include data from all walks (R1, R2).

  • After the last request, the client closes its side of the connection and the server should respond with closing the serve side too.

  • The Reset request should cause the server to reset all its state (e.g. previously received walks). The tester will issue this request once, at the beginning of the test. This should allow you to reliably run multiple tests without restarting your server.

Your server can be implemented in any programming language. To ensure fair competition, everybody should run it on the same hardware. Therefore, only the results from running your server on ritchie.ciirc.cvut.cz will count for the final ranking. Be prepared that the ritchie server is shared by many people and load of other user's tasks may temporarily slow down your solution. Especially shortly before the deadline, the interfering load might be quite high.

Although the final ranking will consider only the results from the ritchie server, we welcome experimenting with other hardware platforms and/or operating systems.

If you have any questions, please ask via GitLab Issues.

Protobuf schema

Use the following schema for communication with our clients:

syntax = "proto3";

option java_multiple_files = true;
option java_package = "cz.cvut.fel.esw.server.proto";

message Request {
  oneof msg {
    Walk walk = 1;
    OneToOne oneToOne = 2;
    OneToAll oneToAll = 3;
    Reset reset = 4;
  }
}

message Walk {
  repeated Location locations = 1; // repeated n times, determines edge directions
  repeated uint32 lengths = 2; // [mm], repeated n-1 times
}

message OneToOne {
  Location origin = 1;
  Location destination = 2;
}

message OneToAll {
  Location origin = 1;
}

message Reset {}

message Location {
  int32 x = 1; // [mm]
  int32 y = 2; // [mm]
}

message Response {
  enum Status {OK = 0; ERROR = 1;};

  Status status = 1; // Always present
  string errMsg = 2;
  uint64 shortest_path_length = 3; // [mm]
  uint64 total_length = 4; // [mm]
}

Server performance

The server's performance is measured with a test program that sends a larger amount of data and then repeatedly asks for the shortest path between any two locations.

The total time from the start of the communication to the receipt of the last response is measured. In order for your solution to pass, this time must be less or equal to 7 seconds.

Solution testing and measuring

To test your server, use the web application and specify:

  • the IP address of your server (any public IP address or ritchie.ciirc.cvut.cz as described below)

  • the programming language in which you wrote your solution.

Also, upload the final working solutions to the upload system. The code in the upload system must be complete and include brief instructions on how to compile and run it. All your tests must be reproducible. You may be asked to reproduce similar results that you have in the web application during the tutorial to prove that your solution is correct.

Static test data

You can use the files listed below for local testing and debugging. For each file, we show expected answers to the OneTo* requests in the file. You can use the netcat (nc) program to send the files to your program via network, e.g.:

nc localhost 1234 < file.pbf

Simple, short

  • file name: walk1nodes3-2.pbf
  • number of walks: 1
  • number of locations: 3
  • OneToOne.shortest_path_length = 161855
  • OneToAll.total_length = 58260

Simple, longer

  • file name: walk1nodes100-2.pbf
  • number of walks: 1
  • number of locations: 100
  • OneToOne.shortest_path_length = 19545308
  • OneToAll.total_length = 941824954

Two walks with common edge

  • file name: walk2nodes6.pbf
  • number of walks: 2
  • number of locations: 6
  • OneToOne.shortest_path_length = 41691
  • OneToAll.total_length = 664267

Many walks for performance tuning

  • file name: walk3000.pbf
  • number of walks: 3000
  • number of locations: 1679366
  • OneToOne.shortest_path_length = 248408794
  • OneToAll.total_length = 23914104899237
  • file name: static.pbf
  • number of walks: 30
  • number of locations: 14414
  • number of OneToOne requests: 3
  • number of OneToAll requests: 1
  • OneToOne.shortest_path_length = [145643366, 187086324, 338065539]
  • OneToAll.total_length = [18451560506287]

Warning: The numbers for static.pbf file are currently incorrect!

Ritchie server

For final evaluation, you should use our server Ritchie. You can access it by running ssh username@ritchie.ciirc.cvut.cz with the main CVUT password. This server is equipped with powerful Intel processors running Linux and is connected to the testing client that sends requests via a 10 Gbps Ethernet.

Unfortunately, the server is behind a firewall that blocks all connections to the outside except SSH. Therefore, when configuring the server in the web application, use the hostname ritchie.ciirc.cvut.cz instead of the public IP address.

Contest

Eight fastest solutions will get 8, 7, 6, 5, 4, 3, 2 and 1 additional points each.

Presentation of the best results

The last lab is reserved for the presentation of the best solutions. Be ready to present your solution.

Evaluation

12 points is awarded if your solution passes the time limit.

Up to 8 of the total 20 points are awarded for the quality of the solution and the quality of the code, which are subjective. The subjective evaluation includes: used algorithms and techniques, readability of the code, readme and project overview, git history, consistency of the code and naming conventions.

Readme should include one or more paragraphs that include: what is the purpose of the application (task assignment reformulated), brief description of algorithms and techniques you used to solve the challenges present in the task, complete instructions required to compile and run your application, used libraries and what you think should be emphasized for an average programmer that would encounter your source code when browsing public repositories (hyphotetically! do not publish your source code!).