Architecture
============
- The current implementation follows the UFO architecture: the reader and the dataset builder are split
  into two filters (a minimal pipeline sketch follows this list).
  * The reader is multi-threaded, but only a single instance of the builder can be scheduled. This could
    limit the maximum throughput on dual-head, or even single-head, many-core systems.
  * Another problem here is timing. All events in the builder are initiated by the reader. Consequently,
    it seems we cannot time out on a semi-complete dataset if no new data is arriving.
  * Besides performance, this is also critical for stability. With continuous streaming there is no
    problem; however, if a finite number of frames is requested and some packets are lost, the software
    will wait forever for the missing bits.
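
  As a reference, here is roughly how such a two-filter pipeline is assembled. This is only a minimal
  sketch based on the ufo-core C API; the plugin names "reader" and "builder" are placeholders for the
  actual plugins, which are not named in these notes.

    #include <ufo/ufo.h>

    int main (void)
    {
        GError *error = NULL;
        UfoPluginManager *pm = ufo_plugin_manager_new ();
        UfoTaskGraph *graph = UFO_TASK_GRAPH (ufo_task_graph_new ());

        /* Placeholder plugin names, to be replaced with the real ones. */
        UfoTaskNode *reader  = UFO_TASK_NODE (ufo_plugin_manager_get_task (pm, "reader", &error));
        UfoTaskNode *builder = UFO_TASK_NODE (ufo_plugin_manager_get_task (pm, "builder", &error));

        if (error != NULL) {
            g_printerr ("%s\n", error->message);
            return 1;
        }

        /* The reader may run multi-threaded; the builder is a single
         * instance whose events are all driven by buffers from the reader. */
        ufo_task_graph_connect_nodes (graph, reader, builder);

        UfoBaseScheduler *sched = ufo_scheduler_new ();
        ufo_base_scheduler_run (sched, graph, &error);

        g_object_unref (sched);
        g_object_unref (graph);
        g_object_unref (pm);
        return error == NULL ? 0 : 1;
    }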
Problems
========
- When streaming at high speed (~16 data streams; 600 Mbit/s & 600 kpck/s each), the data streams quickly
  get desynchronized (although all packets are delivered).
  * It is unclear whether the problem is on the receiver side (no CPU cores are overloaded) or the
    desynchronization first appears on the simulated sender. A test with real hardware is required.
  * For borderline scenarios, increasing the number of buffers from 2 to 10-20 helps. But at full speed,
    even thousands of buffers are not enough: the packet counts quickly drift apart.
  * Further increasing the packet buffer provided to 'recvmmsg' does not help, even if blocking is
    enforced until all packets are received (see the recvmmsg sketch after this list).
  * At the speed specified above, the system also works without libvma.
  * Actually, with libvma a larger buffer is required. In the beginning, libvma's performance gradually
    speeds up (it has always behaved like that), and during this period significant desynchronization
    happens. To compensate for it, we need about 400 buffers with libvma, compared to only 10 with the
    standard Linux networking stack.
- In any case (with or without libvma), some packets are lost in the beginning when high-speed
  communication is tested.
  * Usually the first packets are transferred OK, but then a few packets are lost occasionally here and
    there (resulting in broken frames). This basically breaks the use case of grabbing a few packets and
    exiting. It is unclear whether this is a server- or client-side problem (it makes sense to see how
    real hardware behaves).
  * Can we pre-heat to avoid this speeding-up problem (increase the pre-allocated buffers, disable
    power-saving mode, ...)? Or will this also be a non-issue with real hardware? We can send warm-up UDP
    packets (they should be sent from another host), but packets are still lost:
      for i in $(seq 4000 4015); do echo "data" > /dev/udp/192.168.34.84/$i; done
    A receiver-side buffer-tuning idea is sketched after this list.
  * The following does not help: a newer version of libvma, tuning of its options.
- Communication breaks with small MTU sizes (below 1500), but this is probably not important (packets are
  delivered, but with extreme latencies; some tuning of the network stack is probably required).
- Technically, everything should work if we start the UFO server while data is already being streamed.
  However, the first received dataset could be any dataset in the sequence, so the verification check
  fails because the data is shifted by a random number of datasets (a possible resynchronization approach
  is sketched at the end of this section).
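
  On the 'recvmmsg' point above: this is the kind of batched reception meant, shown here as a sketch
  with illustrative sizes (BATCH and PKT_SIZE are not values taken from these notes):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    #define BATCH    512    /* illustrative: datagrams per recvmmsg() call */
    #define PKT_SIZE 9000   /* illustrative: large enough for one datagram */

    /* Receive up to BATCH datagrams with a single system call. */
    static int receive_batch (int sock, char bufs[BATCH][PKT_SIZE])
    {
        struct mmsghdr msgs[BATCH];
        struct iovec iovecs[BATCH];

        memset (msgs, 0, sizeof (msgs));
        for (int i = 0; i < BATCH; i++) {
            iovecs[i].iov_base = bufs[i];
            iovecs[i].iov_len  = PKT_SIZE;
            msgs[i].msg_hdr.msg_iov    = &iovecs[i];
            msgs[i].msg_hdr.msg_iovlen = 1;
        }

        /* With flags = 0 on a blocking socket the call waits for the whole
         * batch; MSG_WAITFORONE returns once at least one datagram arrived.
         * Neither variant prevents the streams from drifting apart. */
        int n = recvmmsg (sock, msgs, BATCH, MSG_WAITFORONE, NULL);
        if (n < 0)
            perror ("recvmmsg");
        return n;
    }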
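
  On pre-heating: one receiver-side mitigation worth trying (an assumption on my side, not something
  verified here) is to request a large kernel receive buffer before the stream starts, so that bursts
  arriving during the warm-up phase are not dropped:

    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>

    /* Open a UDP receive socket with an enlarged kernel buffer. */
    static int open_receiver (uint16_t port)
    {
        int sock = socket (AF_INET, SOCK_DGRAM, 0);
        if (sock < 0)
            return -1;

        /* Illustrative size; the kernel caps it at net.core.rmem_max,
         * which may need to be raised separately via sysctl. */
        int rcvbuf = 64 * 1024 * 1024;
        setsockopt (sock, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof (rcvbuf));

        struct sockaddr_in addr;
        memset (&addr, 0, sizeof (addr));
        addr.sin_family      = AF_INET;
        addr.sin_addr.s_addr = htonl (INADDR_ANY);
        addr.sin_port        = htons (port);

        if (bind (sock, (struct sockaddr *) &addr, sizeof (addr)) < 0)
            return -1;

        return sock;
    }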
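
  On the shifted datasets: a possible fix is to anchor the check at the first complete dataset instead of
  assuming the stream starts at dataset 0. The layout constant and the frame numbering below are
  assumptions for illustration only:

    #include <stdint.h>

    #define FRAMES_PER_DATASET 16  /* assumed layout, not from these notes */

    /* Index of the first dataset that is fully contained in the stream,
     * given the first frame number actually observed on the wire. */
    static uint64_t first_complete_dataset (uint64_t first_seen_frame)
    {
        return (first_seen_frame + FRAMES_PER_DATASET - 1) / FRAMES_PER_DATASET;
    }

    /* Dataset index relative to the first complete one; frames before that
     * boundary belong to a partial dataset and should be discarded. */
    static uint64_t relative_dataset (uint64_t frame_number, uint64_t first_seen_frame)
    {
        return frame_number / FRAMES_PER_DATASET
               - first_complete_dataset (first_seen_frame);
    }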
Questions
=========
- Can we pre-allocate several UFO buffers for forthcoming events? Currently, we need to buffer
  out-of-order packets and copy them later (or buffer everything for simplicity). We could avoid this
  data copy if we could get at least one buffer in advance.
- How can I execute the 'generate' method of the 'reductor' filter if no new data has arrived on the
  input for a specified amount of time? One option is sending an empty buffer with metadata indicating a
  timeout, but this is again hackish (a sketch of this workaround follows at the end of this section).
- Can we use 16-bit buffers? I can set the dimensions to 1/4 of the correct value to work around this,
  but is it possible to do it in a clean way?
- What is the 'ufotools' Python package mentioned in the documentation? Is it just a typo?
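
  Back to the reductor timeout question: a sketch of the empty-buffer workaround, driven from the reader
  side. The timeout value and the helpers wait_for_packet(), push_packet_buffer() and
  push_empty_timeout_buffer() are all hypothetical:

    #include <glib.h>

    #define TIMEOUT_US (500 * G_TIME_SPAN_MILLISECOND)  /* assumed timeout */

    /* Hypothetical helpers: wait_for_packet() blocks until data arrives or
     * the deadline passes; push_empty_timeout_buffer() emits an empty UFO
     * buffer whose metadata marks it as a timeout event for the reductor. */
    extern gboolean wait_for_packet (gint64 deadline_us);
    extern void push_packet_buffer (void);
    extern void push_empty_timeout_buffer (void);

    static void reader_loop (void)
    {
        for (;;) {
            gint64 deadline = g_get_monotonic_time () + TIMEOUT_US;

            if (wait_for_packet (deadline)) {
                push_packet_buffer ();
            } else {
                /* No input in time: wake the reductor with an empty buffer
                 * so it can flush the semi-complete dataset. */
                push_empty_timeout_buffer ();
            }
        }
    }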