Notes
| Repository structure
Each recording and corresponding metadata, single unit properties and quality metrics were packaged in the Neurodata Without Borders: Neurophysiology version 2.0 (NWB:N 2.0) data format using the MatNWB API. A single NWB file was created from each recording. NWB files were placed in folders based on the identifier of the animal (from Rat01 to Rat20), probe insertion sequence (Insertion1 or Insertion2) and cortical depth (from Depth1 to Depth3). The filename of the NWB file (identifier) was constructed by concatenating the above information (e.g., Rat01_Insertion2_Depth3).
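The naming scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset: the helper names `build_identifier` and `parse_identifier` and the `RecordingId` type are hypothetical, and the pattern assumes two-digit animal numbers and single-digit insertion and depth numbers, as in the ranges listed above.

```python
import re
from typing import NamedTuple


class RecordingId(NamedTuple):
    """Components of a recording identifier (hypothetical helper type)."""
    rat: int
    insertion: int
    depth: int


# Assumed pattern: RatNN_InsertionN_DepthN, e.g. Rat01_Insertion2_Depth3
_ID_PATTERN = re.compile(r"^Rat(\d{2})_Insertion(\d)_Depth(\d)$")


def build_identifier(rat: int, insertion: int, depth: int) -> str:
    """Concatenate animal, insertion and depth into a file identifier."""
    return f"Rat{rat:02d}_Insertion{insertion}_Depth{depth}"


def parse_identifier(identifier: str) -> RecordingId:
    """Split an identifier back into its numeric components."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"Unexpected identifier format: {identifier!r}")
    return RecordingId(*(int(group) for group in match.groups()))


identifier = build_identifier(1, 2, 3)
print(identifier)
print(parse_identifier(identifier))
```

Appending ".nwb" to the identifier gives the file name used in the loading examples in this document.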
The CSV file named "Animal_characteristics_and_targeted_cortical_areas" contains information about the weight and sex of the animals, as well as the cortical area and stereotaxic coordinates corresponding to each probe insertion. The "Recording_characteristics" CSV file lists several useful properties of each NWB file: the file size, the duration of the recording, the cortical location, the single unit yield, the average signal-to-noise ratio of single units, the degree of power line (50 Hz) noise contamination (given both by the power spectral density value measured at 50 Hz and by the ratio of the power spectral density measured at 50 Hz to that measured at 49 Hz), and the RMS noise and RMS signal levels.
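To make the 50 Hz to 49 Hz ratio concrete, the following sketch computes it for a synthetic signal. It is an illustration under stated assumptions, not the procedure used to produce the CSV values: the spectral estimator (a single-bin periodogram), the sampling rate, the signal length and the function names `power_at` and `line_noise_ratio` are all chosen here for the example.

```python
import cmath
import math
import random


def power_at(signal, fs, freq):
    """Periodogram power of `signal` at one frequency (single-bin DFT).

    Assumption: a plain periodogram stands in for whatever spectral
    estimator was actually used for the dataset.
    """
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return abs(acc) ** 2 / (fs * n)


def line_noise_ratio(signal, fs, line_freq=50.0, ref_freq=49.0):
    """Ratio of the power at the line frequency to the power at 49 Hz."""
    return power_at(signal, fs, line_freq) / power_at(signal, fs, ref_freq)


# Synthetic data: 2 s of white noise sampled at 1 kHz (illustrative values),
# with and without an added 50 Hz sinusoid.
random.seed(0)
fs = 1000.0
n = 2000
clean = [random.gauss(0.0, 1.0) for _ in range(n)]
contaminated = [x + 0.5 * math.sin(2 * math.pi * 50.0 * k / fs)
                for k, x in enumerate(clean)]

print("ratio, clean signal:       ", line_noise_ratio(clean, fs))
print("ratio, contaminated signal:", line_noise_ratio(contaminated, fs))
```

The added sinusoid raises the power at 50 Hz but leaves the power at 49 Hz unchanged in this setup, so the contaminated signal gives a ratio well above 1, while the ratio for the clean signal reflects only the noise.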
NWB file structure
Each NWB file contains several main groups, which are similar to directories. The acquisition group contains the continuous wideband 128-channel data (‘wideband_multichannel_recording’) in a compressed form, as well as several parameters related to the raw data, such as the measurement unit or the data conversion number. The general group contains metadata about the experiments and consists of several subgroups related to the recording probe (‘general/devices’; ‘general/extracellular_ephys’) or to the subjects of the experiments (‘general/subject’). The former subgroups carry information about the probe location (brain area and stereotaxic coordinates) and the relative positions and laminar locations of the recording sites, while the latter contains metadata about the animal (e.g., sex, species, subject ID, or weight). Information about spike sorting and single units, together with the corresponding data, is available in the units group. For each unit, we included the mean and standard deviation of its spike waveform on all channels, calculated both from the filtered (‘mean_waveform_all_channels_filt’; ‘waveform_sd_all_channels_filt’) and the wideband data (‘mean_waveform_all_channels_raw’; ‘waveform_sd_all_channels_raw’). For easier visualization of the multichannel spike waveform in two dimensions, we also added an array which contains the mean spike waveform in the 32 x 4 shape of the electrode array (‘mean_waveform_all_channels_filt_32x4’; ‘mean_waveform_all_channels_raw_32x4’). Furthermore, the spike waveform recorded on the channel with the largest spike (i.e., the peak waveform channel) was saved separately (‘mean_waveform_peak_channel_filt’; ‘mean_waveform_peak_channel_raw’). Several single unit properties and cluster quality metrics, as well as the spike times and spike count of each unit, were also saved in the units group.
Furthermore, to aid users in selecting and analyzing a subset of this dataset appropriate for their research goals, we also created an NWB file (‘allSingleUnits.nwb’) which contains all single units with all the properties listed above, along with the identifier of the recording (‘units/session_id’) and the cortical area (‘units/cortical_area’) they originate from. The structure of NWB files can be explored using the HDFView software.
Interacting with the NWB files
Users can import data from NWB files using the PyNWB and MatNWB APIs, or using SpikeInterface. Loaded samples of the raw data have to be multiplied by a conversion number (0.195) to get the amplitudes in microvolts. Here we provide some examples of how users can import data from NWB files using the MATLAB-based MatNWB API.

Loading a short segment (20,000 samples, corresponding to 1 second of data) of the raw wideband recording on all 128 channels:

    nwb = nwbRead('Rat01_Insertion1_Depth1.nwb');
    % Load channels 1-128 and samples 1-20000 (MatNWB order: channels x time)
    dataChunk = nwb.acquisition.get('wideband_multichannel_recording').data.load([1, 1], [128, 20000]);
    % Convert the raw samples to microvolts using the conversion number
    dataChunkMicrovolts = double(dataChunk) * 0.195;

It is important to note that TimeSeries data types in NWB files are stored with time in the first dimension and channels in the second, but the dimensions are reversed in MatNWB.

Loading and plotting the mean spike waveform of a specific single unit on its peak waveform channel:

    peakChannels = nwb.units.vectordata.get('peak_waveform_channel').data.load();
    meanWaveforms = nwb.units.vectordata.get('mean_waveform_all_channels_filt').data.load();
    mySingleUnit = 11;
    % Select the waveform of the chosen unit on its peak channel
    singleUnitWaveform = meanWaveforms(peakChannels(mySingleUnit), :, mySingleUnit);
    plot(singleUnitWaveform);

Loading the isolation distance quality metric of all units found in a single NWB file:

    IDvalues = nwb.units.vectordata.get('isolation_distance').data.load();

Spike times are stored in a special structure called a ragged array, which consists of two vectors. The spike_times vector contains the spike times of all single units concatenated one after the other, while the spike_times_index vector stores where the spike times of individual single units are located in the spike_times vector (see also https://neurodatawithoutborders.github.io/matnwb/tutorials/html/ecephys.html#H_97F533F8). We can load the spike times of a specific single unit (in seconds) in the following way:

    allSpikeTimes = nwb.units.spike_times.data.load();
    spikeTimesIndex = nwb.units.spike_times_index.data.load();
    % Spikes of unit 2 start after the last spike of unit 1 and end at spikeTimesIndex(2)
    spikesOfSingleUnit2 = allSpikeTimes(spikeTimesIndex(1)+1 : spikeTimesIndex(2));

SpikeInterface can also be used to load the wideband data and single unit properties (in Python, works only with version 0.13):

    import spikeextractors as se
    nwbPath = 'Rat01_Insertion1_Depth1.nwb'
    recording = se.NwbRecordingExtractor(nwbPath)
    sorting = se.NwbSortingExtractor(nwbPath)
    mySingleUnit = 2
    sorting.get_unit_property(mySingleUnit, 'isolation_distance')

The dataset is also available at the G-Node GIN repository under the following link and DOI:
https://gin.g-node.org/UlbertLab/High_Resolution_Cortical_Spikes
https://doi.gin.g-node.org/10.12751/g-node.arf7ol/
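The ragged-array layout of spike times described above can be illustrated without any NWB library. The sketch below uses made-up spike times and a hypothetical helper, `spikes_of_unit`; it assumes, as in the MatNWB example, that each entry of spike_times_index marks the end of one unit's block in spike_times. Note that Python indexing is 0-based, so the second unit is unit 1 here.

```python
def spikes_of_unit(spike_times, spike_times_index, unit):
    """Return the spike times of one unit from a ragged array.

    `unit` is 0-based. Entry `unit` of `spike_times_index` is the
    (exclusive) end of that unit's block; the block starts where the
    previous unit's block ended, or at 0 for the first unit.
    """
    start = 0 if unit == 0 else spike_times_index[unit - 1]
    stop = spike_times_index[unit]
    return spike_times[start:stop]


# Made-up example: three units with 3, 2 and 4 spikes (times in seconds).
spike_times = [0.10, 0.25, 0.90, 0.05, 0.40, 0.30, 0.60, 0.75, 1.20]
spike_times_index = [3, 5, 9]

for unit in range(len(spike_times_index)):
    print(unit, spikes_of_unit(spike_times, spike_times_index, unit))
```

For the second unit this selects elements 4 to 5 in 1-based terms, which is the same range as `allSpikeTimes(spikeTimesIndex(1)+1 : spikeTimesIndex(2))` in MATLAB.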
We also provide a MATLAB script ("NWB_tutorial_script.m") which can be used to load, visualize and preprocess (e.g., filter) files in the dataset using MatNWB. |