|
Notes
| Dataset structure
Details of the collection, processing and analysis of the recordings are described in the published article. Each recording, together with its metadata, single unit properties and quality metrics, was packaged in the Neurodata Without Borders: Neurophysiology version 2.0 (NWB:N 2.0) data format using the NeuroConv Python package. A single NWB file was created from each recording. NWB files were placed in folders based on the brain area (Neocortex or Thalamus), the animal model (Rat, Mouse or Human) and the number of channels (e.g., 256, 128, 64, 32 or 16). The filename of each NWB file (identifier) was constructed by concatenating the above information with the identifier of the animal (e.g., Rat01_Neocortex_256channel.nwb). If multiple probe insertions were performed in a single animal, or recordings were carried out at multiple brain depths, this information was also incorporated in the NWB filename (e.g., Rat01_Thalamus_256ch_Insertion1.nwb). The original human Neuropixels recordings can be found in the Dryad repository.
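The naming scheme above can be taken apart programmatically, for example when selecting files by brain area or channel count. The following is a minimal sketch in Python. The regular expression is an assumption inferred only from the two example filenames given above (both the "channel" and "ch" spellings are accepted), and the names RecordingName and parse_recording_name are hypothetical helpers, not part of the dataset or of any NWB library.

```python
import re
from typing import NamedTuple, Optional


class RecordingName(NamedTuple):
    subject: str            # animal identifier, e.g. "Rat01"
    brain_area: str         # "Neocortex" or "Thalamus"
    n_channels: int         # number of recording channels, e.g. 256
    suffix: Optional[str]   # extra information such as "Insertion1", or None


# Assumed pattern, inferred from the two example filenames only.
_PATTERN = re.compile(
    r"^(?P<subject>[A-Za-z]+\d+)_"
    r"(?P<area>Neocortex|Thalamus)_"
    r"(?P<channels>\d+)(?:channel|ch)"
    r"(?:_(?P<suffix>.+))?\.nwb$"
)


def parse_recording_name(filename: str) -> RecordingName:
    """Split an NWB filename of the dataset into its components."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unrecognized NWB filename: {filename!r}")
    return RecordingName(
        subject=match["subject"],
        brain_area=match["area"],
        n_channels=int(match["channels"]),
        suffix=match["suffix"],  # None when no extra information is present
    )


print(parse_recording_name("Rat01_Neocortex_256channel.nwb"))
print(parse_recording_name("Rat01_Thalamus_256ch_Insertion1.nwb"))
```

Filenames that deviate from the assumed pattern raise a ValueError, so that unexpected names are noticed rather than silently misread.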
The CSV file named "Animal_characteristics_" contains information about the subjects (e.g., age, weight, sex). The "Recording_characteristics" CSV file lists several useful properties of each NWB file, including the file size, the duration of the recording, the cortical or thalamic location with stereotaxic coordinates, the single unit yield and the average signal-to-noise ratio of the single units.
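Such a table can be used to pick recordings before downloading large NWB files. The sketch below uses only the Python standard library and an in-memory stand-in for the CSV file. The column names (nwb_file, duration_s, single_unit_yield), the file names and the values are all hypothetical placeholders; the real headers in the "Recording_characteristics" file may differ and should be checked first.

```python
import csv
import io
from typing import List

# Hypothetical stand-in for the CSV file; headers and values are made up
# for illustration and do not describe any real recording.
SAMPLE_CSV = """nwb_file,duration_s,single_unit_yield
example_A.nwb,1800,95
example_B.nwb,900,40
example_C.nwb,1200,60
"""


def select_recordings(csv_text: str, min_units: int) -> List[str]:
    """Return the file names whose single unit yield is at least min_units."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["nwb_file"]
        for row in reader
        if int(row["single_unit_yield"]) >= min_units
    ]


print(select_recordings(SAMPLE_CSV, min_units=50))
```

To work on the real file, the text can be read from disk and passed to the same function after adjusting the column names.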
NWB file structure
Each NWB file contains several main groups, which are similar to directories. The acquisition group contains the continuous wideband multichannel data ('ElectricalSeriesRaw') in a compressed form, as well as several parameters related to the raw data, such as the measurement unit or the data conversion number. The general group contains metadata about the experiments and consists of several subgroups related to the recording probe ('general/devices'; 'general/extracellular_ephys') or to the subjects of the experiments ('general/subject'). The former subgroups carry information about the probe location (brain area and stereotaxic coordinates), while the latter contains metadata about the animal (e.g., sex, species, subject ID, or weight). Information about spike sorting and single units, together with the corresponding data, is available in the units group. For each unit, we included the mean and standard deviation of its spike waveform on all channels, calculated from the wideband data ('waveform_mean'; 'waveform_sd'). Several single unit properties (e.g., half_width, unit_name, peak_channel) and cluster quality metrics (e.g., isolation_distance, amplitude_cutoff, presence ratio), as well as the spike times and spike count of each unit, were also saved in the units group.
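The data conversion number stored with the raw data is a scale factor applied to the stored samples. As a minimal Python sketch of that arithmetic, the function below scales a few raw sample values by 0.195, the example value quoted in this description for multichannel recordings, to obtain amplitudes in microvolts. The function name to_microvolts and the sample values are hypothetical; for a given file, the conversion number stored in that file should be used instead of the default.

```python
from typing import Iterable, List


def to_microvolts(raw_samples: Iterable[int], conversion: float = 0.195) -> List[float]:
    """Scale raw integer samples by the conversion number.

    The default of 0.195 is the example value quoted for multichannel
    recordings; it is an assumption for any particular file.
    """
    return [sample * conversion for sample in raw_samples]


# Made-up raw sample values, for illustration only.
raw = [0, 100, -200]
print(to_microvolts(raw))
```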
Interacting with the NWB files
After downloading, the structure of the NWB files can be explored using the freely available HDFView software. Users can import data from NWB files offline using the PyNWB and MatNWB APIs, or using SpikeInterface. Loaded samples of the raw data have to be multiplied by a conversion number (e.g., 0.195 for multichannel recordings) to obtain amplitudes in microvolts. Here we provide some examples of how users can import data from NWB files using the MATLAB-based MatNWB API.
Loading a short segment (20,000 samples, corresponding to 1 second of data) of the raw wideband recording on all channels (e.g., 128 in the example):
    nwb = nwbRead('Rat01_Neocortex_128channel.nwb');
    % load([first channel, first sample], [last channel, last sample])
    dataChunk = nwb.acquisition.get('ElectricalSeriesRaw').data.load([1, 1], [128, 20000]);
It is important to note that TimeSeries data types in NWB files are stored with time in the first dimension and channels in the second, but the dimensions are reversed in MatNWB.
Loading and plotting the mean spike waveform of a specific single unit on its peak waveform channel:
    peakChannels = nwb.units.vectordata.get('peak_waveform_channel').data.load();
    meanWaveforms = nwb.units.vectordata.get('mean_waveform_all_channels_filt').data.load();
    mySingleUnit = 11;
    % select the waveform of the chosen unit on its peak channel
    singleUnitWaveform = meanWaveforms(peakChannels(mySingleUnit), :, mySingleUnit);
    plot(singleUnitWaveform);
Loading the isolation distance quality metric of all units found in a single NWB file:
    IDvalues = nwb.units.vectordata.get('isolation_distance').data.load();
Spike times are stored in a special structure called a ragged array, which consists of two vectors. The spike_times vector contains the spike times of all single units concatenated one after the other, while the spike_times_index vector stores where the spike times of the individual single units are located in the spike_times vector (see also https://neurodatawithoutborders.github.io/matnwb/tutorials/html/ecephys.html#H_97F533F8).
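The ragged array layout can be illustrated without any NWB library. The Python sketch below uses two made-up vectors and assumes that each entry of the index vector holds the position just past the last spike of the corresponding unit, which is how the indexing is described in the MatNWB tutorial linked above. The function name spike_times_of_unit is hypothetical, and units are counted from zero here, as is usual in Python.

```python
from typing import List, Sequence


def spike_times_of_unit(
    spike_times: Sequence[float],
    spike_times_index: Sequence[int],
    unit: int,
) -> List[float]:
    """Return the spike times of one unit (zero-based) from a ragged array."""
    # A unit's spikes start where the previous unit's spikes end.
    start = 0 if unit == 0 else spike_times_index[unit - 1]
    stop = spike_times_index[unit]
    return list(spike_times[start:stop])


# Made-up example: three units with 3, 2 and 1 spikes (times in seconds).
spike_times = [0.1, 0.5, 0.9, 0.2, 0.7, 0.3]
spike_times_index = [3, 5, 6]

for unit in range(len(spike_times_index)):
    print(unit, spike_times_of_unit(spike_times, spike_times_index, unit))
```

Because MATLAB indexing starts at one, the same slice for the second unit is written there with an offset of +1 on the start position.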
We can load the spike times of a specific single unit (in seconds) in the following way:
    allSpikeTimes = nwb.units.spike_times.data.load();
    spikeTimesIndex = nwb.units.spike_times_index.data.load();
    % spikes of the second unit: from the end of unit 1 to the end of unit 2
    spikesOfSingleUnit2 = allSpikeTimes(spikeTimesIndex(1)+1 : spikeTimesIndex(2));
SpikeInterface can also be used to load the wideband data and single unit properties (in Python; this example works only with SpikeInterface version 0.13):
    import spikeextractors as se
    nwbPath = 'Rat01_Neocortex_128channel.nwb'
    recording = se.NwbRecordingExtractor(nwbPath)
    sorting = se.NwbSortingExtractor(nwbPath)
    mySingleUnit = 2
    sorting.get_unit_property(mySingleUnit, 'isolation_distance')
We also provide a Python script ("NWB_tutorial_script.py") which can be used to load, visualize and preprocess (e.g., filter) files in the dataset using PyNWB. |