Long-term Raw I/Q data request

lk151 · April 28, 2026, 11:57am

Hi all,

I’m having fun with a small side project that tries to recognise RF signals with a machine learning algorithm, with the aim of optimising ground station usage for space applications. I was wondering whether it would be possible to get the raw I/Q data from as many observations of as many different small satellites as possible, over the course of a few weeks or a month, from a well-equipped station (to give recordings of consistent quality), something similar to the EIRSAT-1 ground station.

Is there somewhere the raw data from observations is stored? Or is this avoided due to the size of the files? Would anyone be willing to help me out with this request?

Thanks in advance!

bali · April 28, 2026, 12:37pm

open source project?

lk151 · April 28, 2026, 1:06pm

Honestly haven’t thought about it yet. We’re considering writing a paper on it but for right now we’re just some nerdy friends playing around for fun

mhuebner · April 29, 2026, 10:21am

Our stations BEEGND-1 and BEEGND-4 (both listed on SatNOGS Network) upload their Doppler-corrected IQ data from the SatNOGS client directly into our Nextcloud. You can get them from there:

BEEGND-1: TU Berlin
BEEGND-4: TU Berlin

As our quota is limited, I regularly delete old observations there with a script. Feel free to schedule observations and fetch the data.

bali · April 29, 2026, 12:57pm

SUPERZZZ

lk151 · April 29, 2026, 1:12pm

This is awesome, thank you so much!

tammojan · May 1, 2026, 10:52am

The Dwingeloo radio telescope recordings are also available online: SatNOGS IQ data

frovelli · May 5, 2026, 7:24am

For this kind of work, I think the raw I/Q files are only half of the dataset.
The associated metadata is just as important if the goal is to make the data reusable and comparable across stations or processing pipelines. Some public IQ archives already expose useful observation-level metadata, such as observation id, date/time, satellite, status and sometimes decoded telemetry links. That is already very helpful.

For ML-oriented reuse, though, I would still try to keep a small machine-readable manifest per recording with satellite/NORAD id, observation id if available, station id, timestamp, center frequency, sample rate, sample format, Doppler correction status, pass information and any known decoder/result associated with that recording. Without that context, the files may still be useful, but reproducing results or comparing different observations later becomes much harder.
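
As a rough illustration of such a manifest, here is a minimal Python sketch. The class name, the field names and all example values are made up for this post and are not an established schema; pass information is left out for brevity.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RecordingManifest:
    """Per-recording manifest. All field names are illustrative only."""
    norad_id: int
    station_id: int
    start_utc: str              # ISO 8601 timestamp of the first sample
    center_frequency_hz: int
    sample_rate_hz: int
    sample_format: str          # e.g. "ci16_le"
    doppler_corrected: bool
    observation_id: Optional[int] = None       # only if known
    decoder_result_url: Optional[str] = None   # only if known


def manifest_to_json(manifest: RecordingManifest) -> str:
    """Serialise with sorted keys so manifests diff cleanly."""
    return json.dumps(asdict(manifest), indent=2, sort_keys=True)


# Placeholder values, not a real satellite, station or observation.
example = RecordingManifest(
    norad_id=99999,
    station_id=0,
    start_utc="2026-01-01T00:00:00Z",
    center_frequency_hz=437000000,
    sample_rate_hz=48000,
    sample_format="ci16_le",
    doppler_corrected=True,
)
print(manifest_to_json(example))
```

Writing one such JSON file next to each recording would already be enough for a training pipeline to filter by satellite, station or sample rate without opening the I/Q files.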

mhuebner · May 5, 2026, 10:36am

I have always wanted the client to write real SigMF files. Most of the metadata you mention can be added there. Plus: it’s an emerging open standard.

frovelli · May 5, 2026, 11:01am

Good point, yes. SigMF is probably the right direction here, much better than inventing an ad-hoc manifest format. The useful part would probably be a small SatNOGS-specific SigMF extension or convention for the mission/observation context that is not purely capture-level: observation id, station id, satellite/NORAD id, Doppler correction status, and possibly decoder/result links when available. That would keep the IQ recordings aligned with an existing open standard, while making them easier to reuse across ML experiments and downstream tools.
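
To make the idea concrete, here is a small sketch of how such a convention could look on top of a plain SigMF metadata dictionary. SigMF lets extensions add namespaced keys and declare them under `core:extensions`; the `satnogs` namespace, its version number, the key names and the values below are hypothetical and only serve as an example.

```python
import copy


def add_satnogs_context(meta: dict, context: dict) -> dict:
    """Return a copy of `meta` with hypothetical `satnogs:` keys in `global`."""
    enriched = copy.deepcopy(meta)          # leave the caller's dict untouched
    global_obj = enriched.setdefault("global", {})
    for key, value in context.items():
        global_obj[f"satnogs:{key}"] = value
    # Declare the (hypothetical) extension so readers know about the namespace.
    extensions = global_obj.setdefault("core:extensions", [])
    if not any(ext.get("name") == "satnogs" for ext in extensions):
        extensions.append(
            {"name": "satnogs", "version": "0.0.1", "optional": True}
        )
    return enriched


base_meta = {"global": {"core:datatype": "ci16_le"}, "captures": [], "annotations": []}
# Placeholder values, not a real observation.
context = {
    "observation_id": 0,
    "station_id": 0,
    "norad_id": 99999,
    "doppler_corrected": True,
}
enriched_meta = add_satnogs_context(base_meta, context)
print(sorted(enriched_meta["global"]))
```

Marking the extension as optional means that tools which do not know the namespace can still read the capture-level fields.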

bali · May 11, 2026, 10:53am

Station 4451 :

SatNOGS IQ data

1 TB of archives:
https://hobisatelit.github.io/iq

mhuebner · May 13, 2026, 7:53am

I’ve updated the stations to upload .tar.zst archives with SigMF data. For now, the metadata just links to the observation on satnogs-network by URL. It is not that easy to get everything from within a post-observation script.

{
    "global": {
        "core:author": "Op of Satnogs-Station 106",
        "core:datatype": "ci16_le",
        "core:description": "Doppler corrected IQ capture of SatNOGS-Observation https://network.satnogs.org/api/observations/14052854",
        "core:num_channels": 1,
        "core:offset": 0,
        "core:sample_rate": 57600,
        "core:sha512": "7f3f01f1f770861d703ee6b1d62ca86bfda0eaf3616e08338ddb31c5b7d1544535fef1a8d87f861a1b6ade028ee7b2e918bc624509fc8421dee5e65345edee76",
        "core:version": "1.10.0"
    },
    "captures": [
        {
            "core:datetime": "2026-05-13T02:24:44",
            "core:frequency": 437325000,
            "core:sample_start": 0
        }
    ],
    "annotations": []
}
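
For anyone who wants to read such a file without extra tooling: `ci16_le` means interleaved little-endian signed 16-bit I and Q values. The following is a minimal standard-library sketch for decoding a small chunk; the function names are my own, and for full recordings something like NumPy's `fromfile` with dtype `"<i2"` would be the more practical choice.

```python
import struct
from typing import List


def read_ci16_le(raw: bytes) -> List[complex]:
    """Decode interleaved little-endian int16 I/Q pairs into complex samples."""
    usable = len(raw) - (len(raw) % 4)     # drop a trailing partial sample
    values = struct.unpack(f"<{usable // 2}h", raw[:usable])
    # Even positions hold I, odd positions hold Q.
    return [complex(i, q) for i, q in zip(values[0::2], values[1::2])]


def duration_seconds(num_samples: int, sample_rate: float) -> float:
    """Recording length implied by the sample count and core:sample_rate."""
    return num_samples / sample_rate


# Two synthetic samples: (1 - 2j) and (3 + 4j).
chunk = struct.pack("<4h", 1, -2, 3, 4)
samples = read_ci16_le(chunk)
print(samples, duration_seconds(len(samples), 57600))
```

Each complex sample takes 4 bytes, so at the 57600 samples per second shown above one second of recording is 230400 bytes.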

frovelli · May 15, 2026, 1:52pm

That is a very nice step forward.

Linking the SigMF recording back to the SatNOGS observation URL already solves a large part of the provenance problem, especially if the post-observation script does not have easy access to all the context locally.

Maybe a practical split could be:

1. Keep the station-side writer simple and robust: SigMF data, core capture metadata, checksum, observation URL.
2. Optionally enrich the archive later from the SatNOGS Network API, using the observation URL as the key.

That second step could add SatNOGS-specific context such as observation id, station id, satellite/NORAD id, transmitter/mode, Doppler correction status, and decoder/result links if available.

This would keep the capture side clean while still allowing ML or downstream tools to work with richer metadata when needed.
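
A rough sketch of that second step, under a few assumptions: the observation URL sits in `core:description` as in the example above, the API answers that URL with a single JSON object, and the `satnogs:` prefix and the field names passed in are placeholders rather than an agreed convention. The fetch function does a network request and is not called here.

```python
import json
import re
import urllib.request
from typing import Iterable, Optional

OBSERVATION_URL_RE = re.compile(
    r"https://network\.satnogs\.org/api/observations/(\d+)"
)


def observation_url_from_meta(meta: dict) -> Optional[str]:
    """Pull the observation URL out of core:description, if present."""
    description = meta.get("global", {}).get("core:description", "")
    match = OBSERVATION_URL_RE.search(description)
    return match.group(0) if match else None


def fetch_observation(url: str, timeout: float = 30.0) -> dict:
    """Fetch the observation record (assumed to be one JSON object)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def select_context(observation: dict, fields: Iterable[str]) -> dict:
    """Copy the wanted fields under a hypothetical `satnogs:` prefix."""
    return {f"satnogs:{name}": observation[name]
            for name in fields if name in observation}


meta = {"global": {"core:description":
        "Doppler corrected IQ capture of SatNOGS-Observation "
        "https://network.satnogs.org/api/observations/14052854"}}
print(observation_url_from_meta(meta))
```

The result of `select_context` could then be merged into the `global` object of the existing .sigmf-meta file, leaving the station-side script unchanged.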
F.

PE0SAT · May 15, 2026, 2:53pm

Can you share some details on how the SigMF data and meta files are created?

Jan | PE0SAT

mhuebner · May 18, 2026, 2:09pm

Sure, with pleasure.

I use the sigmf-python package. With that one, it is fairly easy to generate the correct metadata. It even calculates the SHA-512 sum of the IQ data for you and gives you the option to validate everything.
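
For reference, `core:sha512` is simply the SHA-512 hash of the data file, so it can be re-checked on the receiving side even without sigmf-python. Here is a minimal standard-library sketch; the function names are made up for this example and this is not how sigmf-python implements it.

```python
import hashlib


def sha512_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large recordings do not fill memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(meta: dict, data_path: str) -> bool:
    """Compare core:sha512 from the metadata with the actual data file."""
    expected = meta.get("global", {}).get("core:sha512")
    if expected is None:
        return False
    return expected.lower() == sha512_of_file(data_path)
```

Calling `checksum_matches(meta, "recording.sigmf-data")` after unpacking an archive, with `meta` loaded from the matching .sigmf-meta file, would catch truncated or corrupted downloads (the file name here is just an example).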

I published the script I currently use in production here:

Please feel free to re-publish your changes, or send them to me, if you make any. I would be especially interested in integrating or using a native replacement for the find_baudrate.py script, which is currently borrowed from the sa2kng docker stack.

If github is easier for you, I can publish it there too.

PE0SAT · May 18, 2026, 2:22pm

Thanks for sharing this code.

I will have a look at it and see how I can integrate it.

GitHub would indeed be easy. At the same time, I understand the choice of codeberg.org; maybe this is a good moment to create an account at Codeberg.

Jan | PE0SAT

frovelli · May 22, 2026, 9:49am

I had a quick look at the repository.

For now I would not touch the sample-rate detection part without understanding the station setup better.

A small low-risk contribution could be documentation around sigmf_packer.py: expected SatNOGS post-script arguments, required environment variables, generated .sigmf-data/.sigmf-meta/.tar.zst archive layout, and the current find_samp_rate.py dependency.

I also noticed two small pyproject details: the package name seems to have a typo, and requests/urllib3 are imported by the script but not listed as dependencies.

That kind of cleanup may make the script easier for others to inspect or reuse, without changing the working station-side behavior.

F.

mhuebner · May 24, 2026, 10:08pm

Thank you very much for the review and your feedback.

Very good catches with the package name typo and the missing dependencies. I have just fixed both.

The post-script arguments are the standard ones that the SatNOGS client calls your script with. I’d love to have that interface reformed somehow (maybe with some JSON there), but atm it works that way.

I paid quite some attention to the tar archive, so that it really only contains the IQ data and the SigMF meta file, without a top-level directory. I found it much cleaner and more minimalistic that way.
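
A flat layout like that can be sketched with Python's `tarfile` by setting `arcname` to the bare file name. This is only an illustration and not the code from the production script; the helper name is made up, and the zstd compression step is left out here.

```python
import tarfile
from pathlib import Path
from typing import Iterable, List


def pack_flat(archive_path: str, members: Iterable[str]) -> List[str]:
    """Write the given files into a tar archive without any directory prefix."""
    with tarfile.open(archive_path, "w") as archive:
        for member in members:
            # arcname drops the directory part, so no top-level folder is stored.
            archive.add(str(member), arcname=Path(member).name)
    # Re-open the archive to report what actually ended up inside.
    with tarfile.open(archive_path, "r") as archive:
        return archive.getnames()
```

Called with the paths of the .sigmf-data and .sigmf-meta files, this returns just their two file names, whatever directory they were written to.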

Unfortunately, I forgot to mention that my station runs directly, without the Docker container, with the Python app installed directly via pip. Because of that, I assume that the .env file and /tmp are freely accessible. It might be a bit more complex with the Docker container boundaries, but it shouldn’t be too difficult to adjust accordingly.