SatNOGS Artifacts definition

pierros · December 3, 2018, 10:51am

Intro

An idea that we have had for a while is to produce a definition for all byproducts of an observation. The concept is to detach an observation from its byproducts (waterfall, audio, data etc.) and make sure that those byproducts (hereafter called Artifacts) are self-described and atomic to the extent possible.

Advantages

By proceeding with such a change we will achieve the following:

Artifacts will not need to be associated with a SatNOGS observation, so they can be produced and submitted independently if needed (not necessarily by satnogs-client and gr-satnogs), enabling more data submissions from stations that are not in SatNOGS Network. You may consider this an extension to the SiDS protocol (which currently lacks the info we need).
We would have a flat abstract definition of all artifacts, which will enable us to deal with them in a unified way (vs the way we expect products of an observation now). As a side effect, our databases will be able to handle the data more gracefully.
By having an abstract definition we are future-proofing our architecture in case new types of Artifacts come along.
It will be easier for researchers and anyone else who would like to do analysis to process those artifacts (without having to traverse the relationships with observations to obtain more info).
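
To make the idea of a flat, self-described artifact a bit more concrete, here is a minimal Python sketch. It is not the draft definition: every field name (`artifact_type`, `norad_id`, `payload_sha256`, `observation_id`, `metadata`) and both helper functions are hypothetical, chosen only to illustrate an artifact that carries its own context and may exist without an observation.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class Artifact:
    """Hypothetical flat artifact record; all field names are illustrative."""
    artifact_type: str                    # e.g. "waterfall", "audio", "data"
    norad_id: int                         # satellite the artifact refers to
    start_utc: str                        # ISO 8601 start of the recording
    end_utc: str                          # ISO 8601 end of the recording
    payload_sha256: str                   # digest identifying the payload
    observation_id: Optional[int] = None  # None when made outside SatNOGS Network
    metadata: dict = field(default_factory=dict)  # type-specific attributes


def make_artifact(artifact_type, norad_id, start_utc, end_utc, payload,
                  observation_id=None, **metadata):
    """Build a record whose digest is computed from the payload bytes."""
    digest = hashlib.sha256(payload).hexdigest()
    return Artifact(artifact_type, norad_id, start_utc, end_utc, digest,
                    observation_id, dict(metadata))


def to_json(artifact):
    """Serialize the record so it can be stored or submitted on its own."""
    return json.dumps(asdict(artifact), sort_keys=True)
```

Keeping `observation_id` optional is the point of the sketch: a station outside the network could submit the same record shape with that field left empty.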

Considerations

Such an implementation raises the following considerations:

We need to decide on what to keep as attributes of observations
We need to minimize, to the extent possible, the duplication of data between observations and artifacts (some is expected)
Such an approach would require some considerable development and coordination between satnogs-client and satnogs-network projects (and down the line db and warehouse too), so it needs careful planning and input from all stakeholders

Draft definition

Below you can find the proposed draft definition. Please add any comments directly there or broader discussion points in this thread.

Cheers!

pierros · December 3, 2018, 11:00am

Pinging a couple of key stakeholders here for input:
@surligas, @kerel, @cshields, @DK3WN, @EA4GPZ, @fredy, @Acinonyx, @DL4PD, @csete, @cgbsat

EA4GPZ · December 6, 2018, 11:16am

Hi pierros, this looks very interesting. Some remarks from me.

I think that the main motivation for getting this Artifacts thing done is so that we can have a storage of “scientific” observations of satellites. In this way, SatNOGS DB would become a library of data transmitted by satellites (not only the binary data per se, but also some other by-products such as the waterfall). Thus, our aim would be that this data can be widely used in many contexts and captures most details of the observation (as practicality allows).

Regarding some of the specific by-products you mention, for the waterfall, a PNG file or similar is no good for scientific use. We should store a matrix of power-spectral density readings (think a list of FFTs, with the modulus squared taken). This needs to be accompanied by metadata regarding the signal processing used to obtain the waterfall, such as what is the frequency and time resolution (FFT bin width, and signal timespan needed to produce a single spectrum line), averaging, FFT window, some form of timestamping, etc.
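
As a rough sketch of what such a waterfall artifact could look like, the Python below computes a matrix of power readings (modulus squared of windowed FFTs, averaged per line) and keeps the processing parameters next to it. The `WaterfallMeta` fields and function names are assumptions made for illustration. A naive DFT is used only to keep the example dependency-free, and the powers are left unnormalized, so a real implementation would use a proper FFT library and define the scaling.

```python
import cmath
import math
from dataclasses import dataclass


@dataclass
class WaterfallMeta:
    """Hypothetical metadata describing how the waterfall was produced."""
    sample_rate_hz: float
    fft_size: int
    window: str        # name of the FFT window, e.g. "hann"
    averaging: int     # number of spectra averaged into one line
    start_utc: str     # timestamp of the first line

    @property
    def bin_width_hz(self):
        return self.sample_rate_hz / self.fft_size

    @property
    def line_duration_s(self):
        # Signal timespan needed to produce a single spectrum line
        return self.fft_size * self.averaging / self.sample_rate_hz


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]


def dft_power(block, window):
    """Unnormalized |DFT|^2 of one windowed block (naive O(n^2) DFT)."""
    n = len(block)
    x = [s * w for s, w in zip(block, window)]
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n)]


def waterfall(samples, meta):
    """Return a list of spectrum lines, each a list of averaged powers."""
    n = meta.fft_size
    win = hann(n)
    step = n * meta.averaging
    rows = []
    for start in range(0, len(samples) - step + 1, step):
        acc = [0.0] * n
        for a in range(meta.averaging):
            block = samples[start + a * n:start + (a + 1) * n]
            for k, p in enumerate(dft_power(block, win)):
                acc[k] += p
        rows.append([p / meta.averaging for p in acc])
    return rows
```

With the metadata stored alongside the matrix, a consumer can recover the frequency of each column (`bin_width_hz`) and the time of each row (`start_utc` plus multiples of `line_duration_s`) without needing the original observation.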

Regarding audio, I have already discussed that an OGG file of FM or SSB demodulated audio is no good for many uses, but I understand the concerns about bandwidth. In any case, an OGG file is better than nothing.

An alternative/complement to audio that I have already proposed, and that I think is interesting to explore in the context of Artifacts, is a Bitstream. This is the output of an FSK or PSK demodulator that has been running during the whole observation. The format is one bit per symbol, so the bandwidth is not large (9.6kbit/s for a 9k6 modem). It allows collecting all the data from the observation by knowing only the specifics of the modulation, without knowing anything about coding. This data can then be used to obtain the packets by using the correct modem. Using one bit per symbol only allows hard decoding for FEC. To allow soft decoding we could even use a small number of bits per symbol (something between 4 and 8), with the obvious increase in bandwidth needed. Of course not all modulations are suited to this idea, but most of the modems used today in small satellites are.
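
To put numbers on the bandwidth argument, here is a small Python sketch that packs hard-decision bits into bytes and estimates the size of a Bitstream. The function names, the MSB-first packing order and the 10-minute pass length are assumptions for illustration, not an agreed format.

```python
def pack_bits(bits):
    """Pack hard-decision bits (0/1) MSB first, zero-padding the last byte."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | (b & 1)
        byte <<= 8 - len(chunk)   # pad a short final chunk with zeros
        out.append(byte)
    return bytes(out)


def unpack_bits(data, nbits):
    """Inverse of pack_bits: recover the first nbits bits."""
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(nbits)]


def bitstream_size_bytes(baud, duration_s, stored_bits_per_symbol=1):
    """Storage needed when each symbol is stored with the given bit depth."""
    total_bits = baud * duration_s * stored_bits_per_symbol
    return (total_bits + 7) // 8


# Assumed 10-minute pass with a 9k6 modem
hard = bitstream_size_bytes(9600, 600)        # 1 bit per symbol
soft = bitstream_size_bytes(9600, 600, 4)     # 4 bits per symbol (soft)
print(hard, soft)  # 720000 2880000
```

Under these assumptions a hard-decision Bitstream for the pass is about 720 kB, and a 4-bit soft-decision one about 2.9 MB.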

Regarding binary data, understood as Packets, it is very important to reach a consistent convention about what constitutes a Packet. This is also an issue with the current SatNOGS DB. For AX.25, it is common convention that a Packet is an AX.25 frame, with the CRC-16 stripped. For other modems and ad-hoc protocols we don’t have such a strong convention, so in gr-satellites I usually take the approach I think is best when submitting data from these modems to the SatNOGS DB. Usually I consider that a Packet has been FEC-decoded and perhaps stripped of all CRCs, but there are some corner cases where it is not so clear what to do.

Finally, and linking this to Daina Bouquin’s pitch at OSCW18 (which I found very inspiring), I think we should get some input from librarians about which metadata we should store. After all, this is supposed to be a long-lasting scientific library, and I can think of many metadata regarding the groundstation, antenna pointing and whatnot, but I think it’s best to try to ask the people that do this for a living.

pierros · December 6, 2018, 4:33pm

This is exactly our motivation and the reasoning behind the artifacts proposal.

Completely agreed, and we are already moving in this direction. Essentially WFs will be matrices that we can render and manipulate via JS in the browser (or download and use for other applications).

Those should be artifact-type (in this case WF) specific information. We should have such definitions and attributes for all different artifacts. As a matter of fact those should all be documented and version controlled in a public repo (with kaitai structs to describe and verify them @DL4PD? )

There should be multiple artifact options for that (audio) too. OGG, WAV or even FLAC.

I might be missing something, but this should be the case for most of our demodulators. I am aware that in some cases we go a step beyond (CRC checking, AX.25 deframing etc.), but we might want to reconsider those (or again have them as separate artifacts).

Completely agreed, that’s why we desperately need the artifacts definitions

I am in contact with Daina and her team to ensure we will be including them on not only artifact definitions but also satellite mission specific details (transmitters etc).

DL4PD · December 6, 2018, 4:43pm

Nowadays this would be an easy task

EA4GPZ · December 9, 2018, 12:20pm

Just to clarify, currently in the SatNOGS observations and DB we have “Data” and “Frames”. These are not what I am proposing as Bitstream. A Frame requires some knowledge of the coding in use: at least the marks for “here starts a packet” and “here the packet ends”. So you get a list of packets where the satellite transmitted information, and presumably the satellite didn’t transmit during the rest of the pass.

Of course you need not check CRCs, decode FEC or do any extra stuff for these Frames. You just need to know enough about the protocol to detect packet boundaries. However, on the one hand some modems will produce a lot of false Frames if CRC isn’t checked, and on the other hand, for some modems you need to do a relatively high amount of work until you get to packet boundaries (descramble, Viterbi, etc.).

What I am proposing as Bitstream is to have a demodulator (think a 9k6 FSK demodulator, for example) running during the whole observation. The Bitstream is the complete output of this demodulator. Then some software that knows about the modem can grab the bitstream, detect packet boundaries, throw away the rest, and even do more work such as FEC and CRCs.
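
A minimal Python sketch of that server-side step, under strong simplifying assumptions: the modem is taken to use a fixed syncword followed by a fixed-length frame, and both are made up here. Real modems may need descrambling, bit de-stuffing or Viterbi decoding before packet boundaries become visible, none of which is shown.

```python
def find_frames(bits, syncword, frame_len_bits, max_errors=0):
    """Scan a bitstream for a syncword and cut out the frames that follow.

    bits and syncword are lists of 0/1 values. A syncword match may contain
    up to max_errors bit errors. Returns a list of (start_index, frame_bits).
    """
    frames = []
    n = len(syncword)
    i = 0
    while i + n + frame_len_bits <= len(bits):
        errors = sum(a != b for a, b in zip(bits[i:i + n], syncword))
        if errors <= max_errors:
            start = i + n
            frames.append((start, bits[start:start + frame_len_bits]))
            i = start + frame_len_bits   # resume the search after this frame
        else:
            i += 1                       # no match: slide by one bit
    return frames


# Hypothetical 8-bit syncword and 8-bit frames embedded in idle zeros
SYNC = [1, 0, 1, 1, 0, 0, 1, 0]
stream = ([0] * 5 + SYNC + [1, 1, 1, 1, 0, 0, 0, 0]
          + [0] * 7 + SYNC + [0, 1, 0, 1, 0, 1, 0, 1] + [0] * 3)
for start, frame in find_frames(stream, SYNC, 8):
    print(start, frame)
```

Everything outside the detected frames is discarded at this stage, and the extracted bits can then be handed to FEC decoding and CRC checks. Supporting a new modem would then mean adding a new extractor on the server rather than updating every client.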

Of course, Bitstreams and Frames are not mutually exclusive concepts. They just represent different degrees of moving functionality between the client and server, with the obvious advantages and disadvantages in this approach.

The idea that led me to think about Bitstreams was some comments about the difficulty of pushing new modem implementations to all the SatNOGS clients. As you can see in gr-satellites, the number of satellites doing ad-hoc things is ever increasing. Bitstreams allow this complexity to be handled in the server.

EA4GPZ · December 9, 2018, 12:35pm

Also, a couple of minor things that I think are worth incorporating into the discussion:

Telemetry submission to other servers. Currently there are some projects that have their own servers: FUNcube, FOX, PW-Sat2 and the Harbin Institute of Technology people, off the top of my head. Their servers use different but simple protocols for submission. It would be interesting to consider some kind of “hook” that submits to these servers when the appropriate Artifacts are submitted into the SatNOGS DB. In any case, the question here is broader: what to do about these other servers?
Telemetry parsing (understood as picking a binary telemetry frame and obtaining the values of the fields, in units such as volts, milliamps, etc). This could quickly turn very complicated, due to the many ad-hoc formats used (see DK3WN’s decoder programs). However, it would be interesting to consider this kind of data as another Artifact. Ideally, the DB should be able to solve queries such as “give me the solar panel voltage of this particular satellite during this particular time span”, and graph the answers to these queries. Some things along these lines (at least the graphs) are done by the FUNcube, Harbin and PW-Sat2 people.
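
To illustrate the telemetry parsing point, here is a small Python sketch that decodes a made-up binary frame into physical units and answers a "value of this field during this time span" query. The frame layout (big-endian: solar panel voltage in mV, battery current in mA, temperature in °C) and all names are hypothetical; real satellites each use their own ad-hoc formats.

```python
import struct
from datetime import datetime

# Hypothetical layout: uint16 solar panel mV, int16 battery mA, int8 temp C
_FRAME = struct.Struct(">Hhb")


def parse_frame(frame):
    """Decode one telemetry frame into a dict of values in physical units."""
    panel_mv, battery_ma, temp_c = _FRAME.unpack(frame[:_FRAME.size])
    return {
        "solar_panel_v": panel_mv / 1000.0,   # millivolts -> volts
        "battery_ma": battery_ma,
        "temp_c": temp_c,
    }


def query(records, field_name, start, end):
    """Return (timestamp, value) pairs for one field within [start, end].

    records is an iterable of (datetime, frame_bytes) tuples.
    """
    return [(t, parse_frame(f)[field_name])
            for t, f in records
            if start <= t <= end]


records = [
    (datetime(2018, 12, 9, 12, 0), struct.pack(">Hhb", 4150, -320, 21)),
    (datetime(2018, 12, 9, 12, 5), struct.pack(">Hhb", 4200, -300, 22)),
    (datetime(2018, 12, 9, 13, 0), struct.pack(">Hhb", 3900, 150, 18)),
]
in_span = query(records, "solar_panel_v",
                datetime(2018, 12, 9, 11, 59), datetime(2018, 12, 9, 12, 30))
```

The list returned by `query` is the kind of series that could be graphed directly, and the parsed dictionary itself could be stored as its own Artifact type.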

cshields · December 12, 2018, 1:03am

We have a conversation going about this. I spoke with Chris Thompson (foxtelem) last month at the AMSAT Symposium and he agreed to write whatever endpoint is needed to accept packets. We have the bridge from satnogs-network to satnogs-db done using SiDS, and it wouldn’t be too difficult to extend this to an owner database as well. The trickier question will be when we want to do this. I think for AMSAT/AMSAT-UK that’s a no-brainer.