Standardization of frame formats for SatNOGS DB

EA4GPZ · December 6, 2020, 12:18pm

Hi all,

Currently we have different applications that can decode telemetry from different satellites and submit the frames to the SatNOGS DB. These are, at least, UZ7HO Soundmodem + DK3WN Telemetry Forwarder, gr-satellites, and SatNOGS itself.

For satellites using AX.25, the common practice when submitting to the DB is to send the full AX.25 frame, without 0x7e HDLC flags and without CRC-16. However, for other satellites it is not so clear what constitutes a “frame”, and there can be different reasonable choices of what to send to the DB.
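As a rough illustration of the AX.25 convention described above, here is a minimal Python sketch that checks the CRC-16 of a deframed AX.25 frame and returns what would be sent to the DB (flags already removed, CRC-16 dropped). The function names are made up for this example, and it assumes that the two FCS bytes appear low byte first after HDLC deframing, which is the usual arrangement.

```python
def crc16_x25(data: bytes) -> int:
    """CRC-16/X.25: reflected polynomial 0x8408, init 0xFFFF, final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def strip_fcs(frame_with_fcs: bytes) -> bytes:
    """Verify the trailing 2-byte FCS and return the frame without it.

    Input: the bytes between the 0x7e flags, after bit de-stuffing.
    Assumption: the FCS is stored low byte first.
    """
    if len(frame_with_fcs) < 3:
        raise ValueError("frame too short to carry an FCS")
    body, fcs = frame_with_fcs[:-2], frame_with_fcs[-2:]
    if crc16_x25(body) != int.from_bytes(fcs, "little"):
        raise ValueError("FCS mismatch")
    return body  # this is what would go to the DB
```

A decoder would call `strip_fcs()` on each deframed candidate and only submit the frames for which no exception is raised.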

In some cases, the different applications follow slightly different conventions in their definition of “frame” when sending to the DB. This is causing entries with slightly different formats in the DB, which is undesirable.

To improve the compatibility of all these applications, and have consistent information in the DB, we should have an open technical discussion to agree on a common definition of the format of the frames that should be sent to the DB for each satellite/modem/protocol.

I have already done some preliminary discussion with Mike DK3WN, Jan PE0SAT and Andy UZ7HO, and we have identified the following points for discussion:

A. Whether the 4-byte CSP header should be swapped in frames transmitted by a GOMspace AX100 or U482C transceiver.

B. Apparently, the optional CRC-32 in CSP packets transmitted with a GOMspace radio is missing in some SatNOGS frames. Andy and I consider that the CRC-32 is part of the frame and should be included in the DB.

C. Apparently there are some differences in frames from AAUSAT-4. The difference might have to do with the Frame Sync Marker that comes before the convolutionally-encoded data. Check differences between Soundmodem, gr-satellites and SatNOGS and decide whether to keep or drop the Frame Sync Marker.

D. Andy has noted that some of the MOBITEX satellites have seemingly erroneous decodes in SatNOGS. Check whether the SatNOGS MOBITEX decoder works at all and decide if we should include “errorcodes” in the frames sent to the DB (the errorcodes are not transmitted over the air but included by the BEESAT GNU Radio decoder at the end of the frame to indicate which blocks were correctly FEC-decoded).

Please add any other topics that you think should be treated. The goal is to discuss these (and perhaps other) points and iron out all the small differences between the applications in order to have a consistent format in the DB entries.

I’ll put more information and opinion in a second post in this thread.

EA4GPZ · December 6, 2020, 12:33pm

For me the trickiest point is “A”, regarding the swapping of the CSP header in the AX100 and U482C frames, so I’ll give some background information here.

The CSP header has 32 bits, but for me it is a bit confusing whether the header should be big-endian, little-endian, or both encodings are supported. The relevant code in libcsp is here. The question boils down to whether the flags should be the first byte of the header or the last byte.

In the original implementation of the AX100 decoder by Morten Jensen from GOMspace, he swapped the CSP header. The effect of this, with GOMX-3 (probably the first satellite using the AX100) and all the other satellites that I’ve seen since then is that the flags are put in the first byte of the header after the swap.

I inherited this swap in gr-satellites, and since 2016 I have been swapping the CSP header in AX100 frames as well as in U482C frames, in order to obtain the flags in the first byte of the frame. Therefore, I guess that there are already a large number of frames in the DB following this convention, which would be a reason to continue swapping the CSP header.
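To make the two conventions concrete, here is a small Python sketch of the swap and of reading the header under either convention. It assumes the CSP v1 bit layout as I understand it from libcsp (priority 2 bits, source 5, destination 5, destination port 6, source port 6, flags 8, with the flags in the least significant byte of the 32-bit word). The function names are only for illustration and are not taken from gr-satellites or libcsp.

```python
import struct


def swap_csp_header(frame: bytes) -> bytes:
    """Reverse the byte order of the first 4 bytes and leave the rest untouched."""
    if len(frame) < 4:
        raise ValueError("frame shorter than the 4-byte CSP header")
    return frame[3::-1] + frame[4:]


def parse_csp_header(frame: bytes, flags_first: bool) -> dict:
    """Parse the header fields (assumed CSP v1 layout).

    flags_first=True  -> flags in the first byte (swapped convention)
    flags_first=False -> flags in the last byte (unswapped convention)
    """
    fmt = "<I" if flags_first else ">I"
    (word,) = struct.unpack(fmt, frame[:4])
    return {
        "priority": (word >> 30) & 0x3,
        "source": (word >> 25) & 0x1F,
        "destination": (word >> 20) & 0x1F,
        "dest_port": (word >> 14) & 0x3F,
        "source_port": (word >> 8) & 0x3F,
        "flags": word & 0xFF,
    }
```

Swapping twice gives back the original frame, so nothing is lost either way. The real question is only which of the two byte orders the DB entries should use.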

On the other hand, Andy mentions the following reasons that could justify no longer swapping the CSP header.

First, some of the CSP frames have a CRC-32C. The presence of the CRC should be indicated by one of the flags in the header, but sometimes the flag is not set correctly. The CRC-32C computation can include or not include the CSP header. This is not indicated explicitly anywhere. In the case that the CRC-32C includes the header, Andy says that swapping the CSP header makes the CRC-32C fail, so the CSP header needs to be swapped back again to check the CRC. I’ll have to check this carefully, but I can comment that the implementations of the CRC-32C are a bit different between different satellites. Sometimes I’ve needed to swap the endianness of the CRC-32C itself.
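Since neither the coverage of the CRC-32C nor its byte order is indicated explicitly, one way to investigate a given satellite is to try all the combinations on a frame that is known to carry a CRC. The following Python sketch does that. It assumes the frame is passed with its header unswapped (as received), and `matching_crc_variants` is a hypothetical helper name, not something that exists in any of the applications.

```python
def crc32c(data: bytes) -> int:
    """CRC-32C (Castagnoli): reflected polynomial 0x82F63B78, init and final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def matching_crc_variants(frame: bytes) -> list:
    """Return the (coverage, byte order) combinations for which the last 4 bytes match.

    frame = 4-byte CSP header + payload + 4-byte CRC-32C.
    """
    if len(frame) < 8:
        return []
    header, payload, trailer = frame[:4], frame[4:-4], frame[-4:]
    candidates = {
        "payload only": payload,
        "header + payload": header + payload,
    }
    matches = []
    for coverage, covered in candidates.items():
        value = crc32c(covered)
        for order in ("big", "little"):
            if value.to_bytes(4, order) == trailer:
                matches.append((coverage, order))
    return matches
```

An empty result means that either the frame has no CRC-32C, the frame is corrupted, or the header was swapped before the check in the "header + payload" case.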

Another reason for not swapping is OPS-SAT. OPS-SAT transmits CSP frames encapsulated as a Reed-Solomon codeword inside an AX.25 frame. The CSP frames come with the flags in the last byte of the header. OPS-SAT has an external telemetry application that gets these CSP frames as input, and requires them not to be swapped (i.e., to have the flags in the last byte). Consistency between OPS-SAT and all the satellites using CSP through the AX100 Reed-Solomon (mode 2) and ASM+Golay (mode 5) modes would be a reason for ensuring that the flags are in the last byte of the header (i.e., not swapping).

dernasherbrezon · December 6, 2020, 1:58pm

Oh! I have a lot of things to say.

First of all we should treat separately satellite-specific decoding and generic decoding. For example:

Generic decoder - CSP packets
Satellite-specific decoder - OPS-SAT packets (CSP packets inside AX.25 packets).

Secondly, I would operate using packet or frame terms, rather than “AX.25 packet with/without CRC”. Let me give some examples. AX.25 defines a frame:

So if we agree on storing AX.25 frames, then we should say “database contains AX.25 frames”.

For the satellite-specific frames/packets, we can define custom format. For example, “database contains CSP packets for OPS-SAT”.

This gives us the following:

It would be very clear to everyone outside what the data structures are. They can google AX.25 frames and decode them. As another example, we can say “we store VCDU frames from LRPT protocol” and it would be clear what they are.
Pretty clear what to store. For example, if we say “we store MOBITEX frames”, then it would be pretty clear that we should store interleaved blocks with parity bytes. But if we say “we store MOBITEX packet”, then it would be just header + payload:

Easy to agree on generic format. For example, we can discuss if we need to store AX.25 frames or AX.25 packets.

Drawbacks:

Decoders should be changed. Currently jradio and gr-satellites output AX.25 packets, rather than AX.25 frames. So if SatNOGS decides to store AX.25 frames, we should adjust our code.
Potentially more data in the database. AX.25 frames contain starting and ending flags.
Decoders should prepend preamble and sync keywords for data link frames. Currently the preamble and sync are used for clock synchronisation and for finding frame boundaries, and they are normally discarded by decoders. Additionally, we search for a partial match of the synchronisation marker, which increases the rate of correctly decoded packets. But if the database required us to send this synchronisation marker, then I guess we should prepend a pre-defined marker.
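To illustrate the last point, here is a minimal Python sketch of a sync marker search that tolerates a few bit errors, plus the "prepend the nominal marker" step that a decoder would need if the DB format included the marker. It is not how jradio or gr-satellites implement it. The names are invented for the example, and the caller chooses the marker value and the error threshold.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two integers."""
    return bin(a ^ b).count("1")


def find_syncword(bits, syncword: int, sync_len: int, max_errors: int) -> list:
    """Return the indices of the first bit after each (possibly partial) match.

    bits: iterable of 0/1 values, syncword: marker as an integer (MSB first).
    """
    mask = (1 << sync_len) - 1
    window = 0
    starts = []
    for i, bit in enumerate(bits):
        window = ((window << 1) | (bit & 1)) & mask
        if i + 1 >= sync_len and hamming_distance(window, syncword) <= max_errors:
            starts.append(i + 1)
    return starts


def prepend_nominal_marker(payload: bytes, syncword: int, sync_len: int) -> bytes:
    """Prepend the error-free marker, since the received one may contain bit errors."""
    return syncword.to_bytes(sync_len // 8, "big") + payload
```

The point of the second function is that the received marker is not stored. What would end up in the DB is always the pre-defined value, regardless of how many bit errors the received marker had.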

EA4GPZ · December 6, 2020, 2:26pm

Thanks for your reply!

To be clear, the goal here is to come up with an agreement on what to do, so potentially software will need to be changed to follow this agreement, but it will be much better in the long run.

Ideally, for each satellite we would have an unambiguous description of what should go in the DB. There are families of satellites that use the same protocols (AX.25 satellites, each of the AX100 modes, MOBITEX, the FUNcubes, etc.), so they can be described together as a group.

For AX.25, the common usage is to store in the DB only the Address, Control and Info fields from Figure 3.1a. The FCS and Flags are thrown away. I can’t think of a reason why we should change this usage.
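For reference, this is roughly how a DB entry in that form (no flags, no FCS) can be split back into its parts. It is a Python sketch with illustrative names. It relies on the standard AX.25 address encoding (six callsign characters shifted left by one bit, then an SSID byte whose lowest bit marks the last address) and does not try to interpret the control field.

```python
def decode_address(field: bytes) -> str:
    """Decode one 7-byte AX.25 address field into 'CALL' or 'CALL-SSID'."""
    callsign = "".join(chr(b >> 1) for b in field[:6]).rstrip()
    ssid = (field[6] >> 1) & 0x0F
    return f"{callsign}-{ssid}" if ssid else callsign


def split_ax25(frame: bytes) -> dict:
    """Split a frame without flags and FCS into addresses, control byte and the rest."""
    addresses = []
    offset = 0
    while True:
        field = frame[offset:offset + 7]
        if len(field) < 7:
            raise ValueError("truncated address field")
        addresses.append(decode_address(field))
        offset += 7
        if field[6] & 0x01:  # extension bit set: this was the last address
            break
    if offset >= len(frame):
        raise ValueError("missing control field")
    return {
        "destination": addresses[0],
        "source": addresses[1] if len(addresses) > 1 else None,
        "repeaters": addresses[2:],
        "control": frame[offset],
        "rest": frame[offset + 1:],  # PID (if present) followed by the Info field
    }
```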

In general, I think that it is better not to store in the DB the information that is typically discarded during the decoding process and which can be recovered from the useful data in the “packet”. This typically includes preambles, syncwords, FEC symbols (such as Reed-Solomon parity check bytes), and sometimes CRCs.

CRCs are a more difficult topic, because they are sometimes almost mandatory to check to avoid many false decodes (such as in AX.25) and sometimes are optional and often not checked (such as CSP).

There are some cases where I think it is difficult to arrive at a good decision. For instance, when there are several protocol layers or fragmentation. For the layers case, take OPS-SAT as an example: should we store the CSP frames or the full AX.25 frames? What happens if the AX.25 had errors that we don’t care about? For the fragmentation case, the problem is when the fragments don’t have a serial number, so defragmentation is easy in the receiver, because the fragments arrive in order from a single source, but not so easy once the frames are in the database. Some of the satellites from Harbin Institute of Technology are an example of this, since they send their packets fragmented in a KISS stream.
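The fragmentation problem can be seen with a small streaming KISS deframer. This is a generic Python sketch using the standard KISS special bytes (FEND 0xC0, FESC 0xDB, TFEND 0xDC, TFESC 0xDD), not the code of any of the decoders discussed here. Any KISS command byte at the start of a packet is left in place.

```python
FEND, FESC, TFEND, TFESC = 0xC0, 0xDB, 0xDC, 0xDD


class KissDeframer:
    """Stateful KISS deframer: fragments must be fed in order, without gaps."""

    def __init__(self):
        self._buffer = bytearray()
        self._escaped = False

    def feed(self, chunk: bytes) -> list:
        """Consume one fragment and return the packets completed by it."""
        packets = []
        for byte in chunk:
            if byte == FEND:
                if self._buffer:
                    packets.append(bytes(self._buffer))
                self._buffer.clear()
                self._escaped = False
            elif self._escaped:
                # Undo the escape; unknown escaped bytes are passed through as-is
                self._buffer.append({TFEND: FEND, TFESC: FESC}.get(byte, byte))
                self._escaped = False
            elif byte == FESC:
                self._escaped = True
            else:
                self._buffer.append(byte)
        return packets
```

If the fragments `C0 01 02 DB`, `DC 03 C0 C0 04` and `05 C0` are fed in order, the deframer outputs the two packets `01 02 C0 03` and `04 05`. If the middle fragment is missing and there is no serial number to notice it, feeding the first and the third gives the single packet `01 02 05`, which was never transmitted.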

However, I think that most of the cases are much easier than this, so the more problematic cases can be treated last.

dernasherbrezon · December 6, 2020, 3:31pm

I can’t think of a reason why we should change this usage.

Agree. This is fine. We should clearly say that database has AX.25 packets instead of AX.25 frames.

There are some cases where I think it is difficult to arrive at a good decision.

With my approach it would be clear. If we store the packet, then everything that is defined in the packet format should be stored. If packet contains CRC, then store CRC.

If the CRC or header is calculated incorrectly on the satellite, then such packets can’t be decoded with a generic AX.25 decoder. This is fine. It just means that the satellite protocol is custom and needs a dedicated decoder.

better not to store in the DB the information that is typically discarded during the decoding process and which can be recovered from the useful data in the “packet”.

That might be problematic for MOBITEX. Let’s say we agreed on storing MOBITEX packets, i.e. header + payload. That means the decoder received the frame, de-interleaved the data, applied any FEC and validated the checksums. But what should we store if a single block failed? Discard the whole frame, or fill the gap with zeros?

In my decoder I’m able to define “gaps” in the data and partially recover information. Here is the explanation: Декодирование телеметрии D-STAR ONE (Decoding D-STAR ONE telemetry) • dernasherbrezon, and the implementation: jradio/src/main/java/ru/r2cloud/jradio/util/GapDataInputStream.java at 73ba0ef0aa0ce3b5ef6255d170fe33f620febce0 · dernasherbrezon/jradio · GitHub

Storing zeros for missing data can lead to incorrect telemetry, since 0 is a valid telemetry value, and it is not possible to store “null” in a byte array/stream. So keeping the original MOBITEX frame is a really good idea.
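The ambiguity of zero-filling can be shown in a few lines. This is a simplified Python sketch of the idea of tracking gaps per block. It is not a port of GapDataInputStream, the class and method names are invented for the example, and the block size is a parameter rather than the real MOBITEX value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GappedPayload:
    """Payload made of fixed-size blocks; None marks a block that failed FEC/CRC."""
    blocks: List[Optional[bytes]]
    block_size: int

    def read(self, offset: int, length: int) -> Optional[bytes]:
        """Return the requested bytes, or None if any of them falls in a gap."""
        out = bytearray()
        for pos in range(offset, offset + length):
            index, within = divmod(pos, self.block_size)
            if index >= len(self.blocks) or self.blocks[index] is None:
                return None
            out.append(self.blocks[index][within])
        return bytes(out)

    def zero_filled(self) -> bytes:
        """What would be stored if gaps were simply replaced with zeros."""
        return b"".join(
            block if block is not None else bytes(self.block_size)
            for block in self.blocks
        )
```

With `read()` a telemetry field that overlaps a failed block comes back as `None`, while a field that really contains zeros comes back as zeros. After `zero_filled()` the two cases look the same, and that information cannot be recovered from a plain byte string in the DB.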

Some of the satellites from Harbin Institute of Technology are an example of this, since they send their packets fragmented in a KISS stream.

Good candidate for storing full frames instead of packets. When the community gets a lot of frames, we can post-process them and extract valid packets. Another example is PACSAT, which uses AX.25 packets to carry its own packets, which can span multiple AX.25 packets: Протокол Pacsat (Pacsat protocol) • dernasherbrezon. Another example is LRPT. They define VCDU frames with CRC:

And LRPT packets can span across several frames:

So far I propose to store:

AX.25 packets
CSP packets
MOBITEX frames
OPS-SAT - CSP packets (discard outer AX.25 packet)
AAUSAT-4 - CSP packets (discard custom outer protocol)
LILACSAT/Harbin satellites - store frames

dernasherbrezon · December 6, 2020, 3:49pm

I would ignore the outer AX.25 packet. I did it and this increased the decoding rate significantly. https://twitter.com/dernasherbrezon/status/1327933791063707649

the problem is when the fragments don’t have a serial number

There is little we can do. Consider stream: “1 valid fragment, 1 with CRC failed / discard, 1 valid fragment”. Without frame counter we will concatenate 1 & 3 on receiver and produce corrupted packet.

PE0SAT · December 6, 2020, 9:19pm

Good day all,

It is good to see that this subject is being addressed, so that in the end there will be multiple solutions that provide the same structured data.

A big thanks to all involved that make this possible.

DL4PD · December 7, 2020, 8:54am

Yeah, that actually is a good plan for an OPS-SAT “Modem”. The biggest problem with ignoring headers that contain the only identification of the spacecraft is that you never know whether the data really belongs to that spacecraft. A good example is the NETSAT swarm, and even the QUBIKs. If a single frequency is used for the downlink of multiple satellites and you remove the spacecraft ID, the data is turned into trash.

I don’t have an idea how this could be solved…

dernasherbrezon · December 7, 2020, 9:25am

I hope it will be naturally solved in the future. Currently it is really hard to get anything decoded. Additionally, different satellites can employ different encodings of their data. So for OPS-SAT it is safe to remove the outer AX.25 identification, because it’s unlikely that any other satellite is transmitting a Reed-Solomon encoded CSP packet inside an AX.25 payload.

EA4GPZ · December 7, 2020, 9:42am

That is another far more difficult problem, and perhaps it would be good to open a new thread devoted to it.

I think there are two cases:

There is something in the RF packet that can be used to distinguish the spacecraft. The problem is if this data is thrown away as part of the decoding process. Then it can be solved by attaching the satellite ID as metadata, and the question would be how to do this in each of the applications. A challenge is that the ID may not be FEC protected, and thus we might be able to decode a packet without being able to reliably identify the satellite.
There is nothing in the RF packet that can be used to distinguish the spacecraft. Then this is a far more difficult problem. Here I’m throwing in even satellites that have the same packet format but different orbits or frequencies, since the receiver might be misconfigured (typically we identify a satellite by the time of the pass and its frequency).

All this boils down to the fact that reliably identifying a satellite given one of its packets is tricky. Mike DK3WN used to do this in his Telemetry Forwarder but I think that he recently stopped doing it, since it was impossible in some cases (in particular some of the MOBITEX satellites).

DL4PD · December 7, 2020, 9:49am

I don’t want to capture this thread

Let me just add another point to that issue: a NORAD ID incorrectly chosen by the observer. This has also happened recently, and it would also make it impossible to identify the source of the transmitted data (and it already is impossible for some of the satellites that don’t have a proper spacecraft identifier in their packets; this is a huge issue with CSP transmissions! Why, the heck, is there a Cubesat Space Protocol without identifiers?).

EA4GPZ · December 7, 2020, 11:16am

This is one of the things I meant by “since the receiver might be misconfigured”.

I think that CSP doesn’t include any satellite ID because bandwidth is precious, there is no global registry of IDs for small satellites (large satellites use the CCSDS Spacecraft IDs and the SANA registry), and because it’s mainly thought with the “one satellite, one groundstation” use case in mind.

DL4PD · December 7, 2020, 11:18am

Yeah, and for commercial sectors you know that there is just your individual satellite transmitting on your allocated frequency at the time the orbit parameters indicate it visible.

But we’re on amateur bands with a lot of interfering other satellites

Edit: and my best guess why every team is encapsulating their transmissions in AX.25 frames is: you need to identify in HAM Bands. It is of no use for the teams themselves and I’ve seen a lot of Modems simply stripping those framings.

dernasherbrezon · December 7, 2020, 11:28am

Agree. That should be part of protocol specification. If satellite chose CSP without reasonable identification or AX.25 with random callsign or incorrectly implemented one of these, then we can’t really do much.

This actually raises another question: what would happen if a receiver tries to send invalid data to the database? Even if we agree on the format for each satellite, who is going to enforce it? Or should it be enforced at all? Let’s assume we update gr-satellites, jradio, satnogs, dk3wn telemetry and uz7ho soundmodem; there will still be users who use old versions of these, and the old versions won’t be compatible with the proper database format.

EA4GPZ · December 7, 2020, 12:00pm

I think we should rely on the applications implementing the format correctly and on people using recent enough versions of the applications.

Checking the frames for correctness from the DB side seems like never-ending work. The problem is that we need to support a myriad of satellites that do a myriad of different things, which allow the frames to be sanity-checked to different degrees.

There is a “version” field in the SIDS request that could be used to identify the applications and their version correctly and from the DB side check this and reject frames that are from a too old and known-bad version. Currently gr-satellites has the version hardcoded to 1.6.6 because reasons.
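As a sketch of how the “version” idea could work, the Python below builds the parameters of a SIDS-style submission and applies a DB-side check against a list of known-bad versions. The parameter names (`noradID`, `source`, `timestamp`, `frame`, `version`) and the timestamp format follow my understanding of the SIDS convention and should be treated as assumptions. The location fields are left out, and the `KNOWN_BAD_VERSIONS` set and the `is_accepted` policy are purely hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Hypothetical list of version strings that the DB would refuse
KNOWN_BAD_VERSIONS = {"0.0.1"}


def build_sids_params(norad_id: int, source: str, timestamp: datetime,
                      frame: bytes, version: str) -> dict:
    """Build the form parameters of a SIDS-style request (location fields omitted)."""
    return {
        "noradID": str(norad_id),
        "source": source,
        "timestamp": timestamp.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
        "frame": frame.hex().upper(),
        "version": version,
    }


def is_accepted(params: dict) -> bool:
    """DB-side check: reject submissions with a missing or known-bad version."""
    version = params.get("version")
    return version is not None and version not in KNOWN_BAD_VERSIONS


params = build_sids_params(
    43466, "N0CALL",
    datetime(2020, 9, 14, 3, 24, 0, tzinfo=timezone.utc),
    bytes.fromhex("0180A782"), "1.6.6",
)
body = urlencode(params)  # what would be sent as the request body
```

For this to be useful, each application would need to report a version string that identifies both the application and its release, which a fixed value like 1.6.6 does not.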

PE0SAT · December 7, 2020, 12:33pm

All this boils down to the fact that reliably identifying a satellite given one of its packets is tricky. Mike DK3WN used to do this in his Telemetry Forwarder but I think that he recently stopped doing it.

I can confirm that this is the case. Until recently, every time a new satellite was launched and identified, Mike had to build a new release. This could take some time because the object wasn’t correctly identified (NORAD ID) and/or it wasn’t clear how to determine the identity of the spacecraft based on the telemetry being received.

With the new version, called GetKISS+ (a merge of GetKISS and the Telemetry Forwarder), the user has to select the right object from a TLE list and enable forwarding (allow data transfer).

To overcome issues that could occur due to TLE differences between users that use their own TLEs and SatNOGS, Mike is investigating integrating the SatNOGS TLE API into GetKISS+.

Some details on the solutions used by Mike can be found at the following link: DK3WN Windows based Telemetry decoding software

PE0SAT · December 7, 2020, 1:24pm

There is a version field defined in SIDS; in the past we used this to solve exactly this issue. I don’t know if this is still the case?

pierros · December 7, 2020, 2:56pm

Just to add my two pointers in this already wide-open and ranging discussion:

Have a look at what we are going to be implementing as a unique satellite ID https://gitlab.com/librespacefoundation/satnogs/satnogs-db/-/issues/245#note_457191210
Keep in mind in your replies/discussion/proposals that satellite <> transmitter. We need a way to also discriminate transmitters (many satellites have multiple modes or even physically multiple transmitters). Right now we have UUIDs for transmitters in DB and I believe that this is the right way to approach it. @EA4GPZ I would like to see us moving fwd with capturing all info you also identified in the issue here together with @surligas so we can have one source of truth about it in one central place. (no need for satyaml I suppose then?)

dernasherbrezon · December 7, 2020, 3:02pm

There is no such field in SatNOGS API output:

{"norad_cat_id":43466,"transmitter":"","app_source":"sids","schema":"","decoded":"","frame":"0180A782005F5EE2500CFD0CD20CA92066001C0001004D000300420002019B019B003D0152011A0068000C000C000B000A00000000035F5EE250006E006EFF750EBCFF7B5F5EE25000030000000100020000034B006D0071A4C5043490B42A58","observer":"YC5ABK-OI09kc","timestamp":"2020-09-14T03:24:00Z"}

So even if the receiver sends it, it won’t be possible to rely on it in the analysis.

dernasherbrezon · December 7, 2020, 3:05pm

Does it mean they can send data using different protocols via different transmitters? I know about LRPT / HRPT protocols on single weather satellite, but I thought cubesats are much simpler.