
xarray

There are two functions for working with XArray Datasets: one converts a CDF file into a Dataset, and the other converts a Dataset back into a CDF file. To use them, you need the xarray package installed.

Both functions attempt to detect ISTP compliance and incorporate that information into the output.
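A minimal round trip might look like the following sketch (the file names here are placeholders, assuming an existing ISTP-compliant CDF file on disk):

>>> from cdflib.xarray import cdf_to_xarray, xarray_to_cdf
>>> # Read a CDF file into an xarray Dataset
>>> ds = cdf_to_xarray('my_file.cdf', to_datetime=True, fillval_to_nan=True)
>>> # ... inspect or modify the Dataset with normal xarray tools ...
>>> # Write the Dataset back out to a new CDF file
>>> xarray_to_cdf(ds, 'my_file_copy.cdf')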

cdf_to_xarray

cdf_to_xarray(filename: str, to_datetime: bool = True, to_unixtime: bool = False, fillval_to_nan: bool = False) -> Dataset

This function converts CDF files into XArray Dataset Objects.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| filename | str | The path to the CDF file to read | required |
| to_datetime | bool | Whether or not to convert CDF_EPOCH/EPOCH_16/TT2000 to datetime64, or leave them as is | True |
| to_unixtime | bool | Whether or not to convert CDF_EPOCH/EPOCH_16/TT2000 to unixtime, or leave them as is | False |
| fillval_to_nan | bool | If True, any data values that match the FILLVAL attribute for a variable will be set to NaN | False |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| dataset | Dataset | An XArray Dataset object containing all of the data and attributes from the CDF file |

Example MMS
>>> # Import necessary libraries
>>> import cdflib.xarray
>>> import xarray as xr
>>> import os
>>> import urllib.request

>>> # Download a CDF file
>>> fname = 'mms2_fgm_srvy_l2_20160809_v4.47.0.cdf'
>>> url = ("https://lasp.colorado.edu/maven/sdc/public/data/sdc/web/cdflib_testing/mms2_fgm_srvy_l2_20160809_v4.47.0.cdf")
>>> if not os.path.exists(fname):
>>>     urllib.request.urlretrieve(url, fname)

>>> # Load in and display the CDF file
>>> mms_data = cdflib.xarray.cdf_to_xarray("mms2_fgm_srvy_l2_20160809_v4.47.0.cdf", to_unixtime=True, fillval_to_nan=True)

>>> # Show off XArray functionality

>>> # Slice the data using built in XArray functions
>>> mms_data2 = mms_data.isel(dim0=0)
>>> # Plot the sliced data using built in XArray functions
>>> mms_data2['mms2_fgm_b_gse_srvy_l2'].plot()
>>> # Zoom in on the sliced data in time using built in XArray functions
>>> mms_data3 = mms_data2.isel(Epoch=slice(716000,717000))
>>> # Plot the zoomed in sliced data using built in XArray functionality
>>> mms_data3['mms2_fgm_b_gse_srvy_l2'].plot()
Example THEMIS
>>> # Import necessary libraries
>>> import cdflib.xarray
>>> import xarray as xr
>>> import os
>>> import urllib.request

>>> # Download a CDF file
>>> fname = 'thg_l2_mag_amd_20070323_v01.cdf'
>>> url = ("https://lasp.colorado.edu/maven/sdc/public/data/sdc/web/cdflib_testing/thg_l2_mag_amd_20070323_v01.cdf")
>>> if not os.path.exists(fname):
>>>     urllib.request.urlretrieve(url, fname)

>>> # Load in and display the CDF file
>>> thg_data = cdflib.xarray.cdf_to_xarray(fname, to_unixtime=True, fillval_to_nan=True)
Processing Steps
1. For each variable in the CDF file
    1. Determine the name of the dimension that spans the data "records"
        - Check if the variable itself might be a dimension
        - The DEPEND_0 attribute likely points to the appropriate dimension
        - If neither of the above applies, we create a new dimension named "recordX"
    2. Determine the name of the other dimensions of the variable, if they exist
        - Check if the variable name itself might be a dimension
        - The DEPEND_X probably points to the appropriate dimensions for that variable, so we check those
        - If either of the above is time varying, the code appends "_dim" to the end of the name
        - If no dimensions are found through the above checks, create a dimension named "dimX"
    3. Gather all attributes that belong to the variable
    4. Add a few attributes that enable better plotting with built-in xarray functions (name, units, etc)
    5. Optionally, convert FILLVALs to NaNs in the data
    6. Optionally, convert CDF_EPOCH/EPOCH16/TT2000 variables to unixtime or datetime
    7. Create an XArray Variable object using the dimensions determined in steps 1 and 2, the attributes from steps 3 and 4, and then the variable data
2. Gather all the Variable objects created in the first step, and separate them into data variables or coordinate variables
3. Gather all global scope attributes in the CDF file
4. Create an XArray Dataset object with the data variables, coordinate variables, and global attributes.
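The naming conventions described above can be seen by inspecting the returned Dataset. A short sketch, reusing the MMS file from the example above (the exact dimension and variable names depend on the contents of the CDF file):

>>> mms_data = cdflib.xarray.cdf_to_xarray("mms2_fgm_srvy_l2_20160809_v4.47.0.cdf", to_unixtime=True, fillval_to_nan=True)
>>> # Dimensions resolved through DEPEND_0/DEPEND_N keep their variable names;
>>> # anything the code could not resolve appears as "recordX" or "dimX"
>>> print(mms_data.dims)
>>> print(mms_data.coords)
>>> # Variable and global attributes from the CDF are carried over
>>> print(mms_data['mms2_fgm_b_gse_srvy_l2'].attrs)
>>> print(mms_data.attrs)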
Source code in cdflib/xarray/cdf_to_xarray.py
def cdf_to_xarray(filename: str, to_datetime: bool = True, to_unixtime: bool = False, fillval_to_nan: bool = False) -> xr.Dataset:
    """
    This function converts CDF files into XArray Dataset Objects.

    Parameters
    ----------
    filename :  str
        The path to the CDF file to read
    to_datetime : bool, optional
        Whether or not to convert CDF_EPOCH/EPOCH_16/TT2000 to datetime64, or leave them as is
    to_unixtime :  bool, optional
        Whether or not to convert CDF_EPOCH/EPOCH_16/TT2000 to unixtime, or leave them as is
    fillval_to_nan : bool, optional
        If True, any data values that match the FILLVAL attribute for a variable will be set to NaN

    Returns
    -------
    dataset : xarray.Dataset
        An XArray Dataset object containing all of the data and attributes from the CDF file

    Example MMS
    -----------
    ```python
    >>> # Import necessary libraries
    >>> import cdflib.xarray
    >>> import xarray as xr
    >>> import os
    >>> import urllib.request

    >>> # Download a CDF file
    >>> fname = 'mms2_fgm_srvy_l2_20160809_v4.47.0.cdf'
    >>> url = ("https://lasp.colorado.edu/maven/sdc/public/data/sdc/web/cdflib_testing/mms2_fgm_srvy_l2_20160809_v4.47.0.cdf")
    >>> if not os.path.exists(fname):
    >>>     urllib.request.urlretrieve(url, fname)

    >>> # Load in and display the CDF file
    >>> mms_data = cdflib.xarray.cdf_to_xarray("mms2_fgm_srvy_l2_20160809_v4.47.0.cdf", to_unixtime=True, fillval_to_nan=True)

    >>> # Show off XArray functionality

    >>> # Slice the data using built in XArray functions
    >>> mms_data2 = mms_data.isel(dim0=0)
    >>> # Plot the sliced data using built in XArray functions
    >>> mms_data2['mms2_fgm_b_gse_srvy_l2'].plot()
    >>> # Zoom in on the sliced data in time using built in XArray functions
    >>> mms_data3 = mms_data2.isel(Epoch=slice(716000,717000))
    >>> # Plot the zoomed in sliced data using built in XArray functionality
    >>> mms_data3['mms2_fgm_b_gse_srvy_l2'].plot()
    ```

    Example THEMIS
    --------------
    ```python
    >>> # Import necessary libraries
    >>> import cdflib.xarray
    >>> import xarray as xr
    >>> import os
    >>> import urllib.request

    >>> # Download a CDF file
    >>> fname = 'thg_l2_mag_amd_20070323_v01.cdf'
    >>> url = ("https://lasp.colorado.edu/maven/sdc/public/data/sdc/web/cdflib_testing/thg_l2_mag_amd_20070323_v01.cdf")
    >>> if not os.path.exists(fname):
    >>>     urllib.request.urlretrieve(url, fname)

    >>> # Load in and display the CDF file
    >>> thg_data = cdflib.xarray.cdf_to_xarray(fname, to_unixtime=True, fillval_to_nan=True)
    ```

    Processing Steps
    ----------------
        1. For each variable in the CDF file
            1. Determine the name of the dimension that spans the data "records"
                - Check if the variable itself might be a dimension
                - The DEPEND_0 attribute likely points to the appropriate dimension
                - If neither of the above applies, we create a new dimension named "recordX"
            2. Determine the name of the other dimensions of the variable, if they exist
                - Check if the variable name itself might be a dimension
                - The DEPEND_X probably points to the appropriate dimensions for that variable, so we check those
                - If either of the above is time varying, the code appends "_dim" to the end of the name
                - If no dimensions are found through the above checks, create a dimension named "dimX"
            3. Gather all attributes that belong to the variable
            4. Add a few attributes that enable better plotting with built-in xarray functions (name, units, etc)
            5. Optionally, convert FILLVALs to NaNs in the data
            6. Optionally, convert CDF_EPOCH/EPOCH16/TT2000 variables to unixtime or datetime
            7. Create an XArray Variable object using the dimensions determined in steps 1 and 2, the attributes from steps 3 and 4, and then the variable data
        2. Gather all the Variable objects created in the first step, and separate them into data variables or coordinate variables
        3. Gather all global scope attributes in the CDF file
        4. Create an XArray Dataset object with the data variables, coordinate variables, and global attributes.
    """

    if to_datetime and to_unixtime:
        to_datetime = False

    # Convert the CDF file into a series of dicts, so we don't need to keep reading the file
    global_attributes, all_variable_attributes, all_variable_data, all_variable_properties = _convert_cdf_to_dicts(
        filename, to_datetime=to_datetime, to_unixtime=to_unixtime
    )

    created_vars, depend_dimensions = _generate_xarray_data_variables(
        all_variable_data, all_variable_attributes, all_variable_properties, fillval_to_nan
    )

    label_variables = _discover_label_variables(all_variable_attributes, all_variable_properties, all_variable_data)
    uncertainty_variables = _discover_uncertainty_variables(all_variable_attributes)

    # Determine which dimensions are coordinates vs actual data
    # Variables are considered coordinates if one of the other dimensions depends on them.
    # Otherwise, they are considered data variables.
    created_coord_vars: Dict[str, xr.Variable] = {}
    created_data_vars: Dict[str, xr.Variable] = {}
    for var_name in created_vars:
        if var_name in label_variables:
            # If these are label variables, we'll deal with these later when the DEPEND variables come up
            continue
        elif (var_name in depend_dimensions) or (var_name + "_dim" in depend_dimensions):
            # If these are DEPEND variables, add them to the DataSet coordinates
            created_coord_vars[var_name] = created_vars[var_name]
            # Check whether these coordinate variables have associated labels
            for lab in label_variables:
                if label_variables[lab] == var_name:  # Found one! Label "lab" covers the dimension "var_name"
                    if len(created_vars[lab].dims) == len(created_vars[var_name].dims):
                        if created_vars[lab].size != created_vars[var_name].size:
                            logger.warning(
                                f"Warning, label variable {lab} does not match the expected dimension sizes of {var_name}"
                            )
                        else:
                            created_vars[lab].dims = created_vars[var_name].dims
                    else:
                        if created_vars[lab].shape[0] != created_vars[var_name].shape[-1]:  # type: ignore
                            logger.warning(
                                f"Warning, label variable {lab} does not match the expected dimension sizes of {var_name}"
                            )
                        else:
                            created_vars[lab].dims = (created_vars[var_name].dims[-1],)
                    # Add the labels to the coordinates as well
                    created_coord_vars[lab] = created_vars[lab]
        elif var_name in uncertainty_variables:
            # If there is an uncertainty variable, link it to the uncertainty along a dimension
            if created_vars[var_name].size == created_vars[uncertainty_variables[var_name]].size:
                created_vars[var_name].dims = created_vars[uncertainty_variables[var_name]].dims
            created_data_vars[var_name] = created_vars[var_name]
        else:
            created_data_vars[var_name] = created_vars[var_name]

    # Check that the datasets are valid
    _verify_dimension_sizes(created_data_vars, created_coord_vars)

    # Create the XArray DataSet Object!
    return xr.Dataset(data_vars=created_data_vars, coords=created_coord_vars, attrs=global_attributes)

xarray_to_cdf

xarray_to_cdf(xarray_dataset: Dataset, file_name: str, unix_time_to_cdf_time: bool = False, istp: bool = True, terminate_on_warning: bool = False, auto_fix_depends: bool = True, record_dimensions: List[str] = ['record0'], compression: int = 0, nan_to_fillval: bool = True) -> None

This function converts XArray Dataset objects into CDF files.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| xarray_dataset | Dataset | The XArray Dataset object that you'd like to convert into a CDF file | required |
| file_name | str | The path at which to place the newly created CDF file | required |
| unix_time_to_cdf_time | bool | Whether or not to assume that variables destined to become CDF_EPOCH/EPOCH16/TT2000 contain unix timestamps | False |
| istp | bool | Whether or not to run checks on the Dataset object to attempt to enforce CDF compliance | True |
| terminate_on_warning | bool | Whether to raise an error on warnings, rather than continuing to try to make the file | False |
| auto_fix_depends | bool | Whether or not to automatically add DEPEND_N dependencies as necessary | True |
| record_dimensions | list of str | If the code cannot determine which dimensions should be made into CDF records, you may provide a list of them here | ['record0'] |
| compression | int | The gzip compression level for the data in the variables; 0 means no compression, and 6 is the gzip standard | 0 |
| nan_to_fillval | bool | Convert all np.nan and np.datetime64('NaT') values to the standard CDF FILLVALs | True |

Returns:

| Type | Description |
| --- | --- |
| None | The function generates a CDF file |

Example CDF file from scratch
>>> # Import the needed libraries
>>> from cdflib.xarray import xarray_to_cdf
>>> import xarray as xr
>>> import os
>>> import urllib.request

>>> # Create some fake data
>>> var_data = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
>>> var_dims = ['epoch', 'direction']
>>> data = xr.Variable(var_dims, var_data)

>>> # Create fake epoch data
>>> epoch_data = [1, 2, 3]
>>> epoch_dims = ['epoch']
>>> epoch = xr.Variable(epoch_dims, epoch_data)

>>> # Combine the two into an xarray Dataset and export as CDF (this will print out many ISTP warnings)
>>> ds = xr.Dataset(data_vars={'data': data, 'epoch': epoch})
>>> xarray_to_cdf(ds, 'hello.cdf')

>>> # Add some global attributes
>>> global_attributes = {'Project': 'Hail Mary',
>>>                      'Source_name': 'Thin Air',
>>>                      'Discipline': 'None',
>>>                      'Data_type': 'counts',
>>>                      'Descriptor': 'Midichlorians in unicorn blood',
>>>                      'Data_version': '3.14',
>>>                      'Logical_file_id': 'SEVENTEEN',
>>>                      'PI_name': 'Darth Vader',
>>>                      'PI_affiliation': 'Dark Side',
>>>                      'TEXT': 'AHHHHH',
>>>                      'Instrument_type': 'Banjo',
>>>                      'Mission_group': 'Impossible',
>>>                      'Logical_source': ':)',
>>>                      'Logical_source_description': ':('}

>>> # Let's add a new coordinate variable for the "direction"
>>> dir_data = [1, 2, 3]
>>> dir_dims = ['direction']
>>> direction = xr.Variable(dir_dims, dir_data)

>>> # Recreate the Dataset with these new objects, and recreate the CDF
>>> ds = xr.Dataset(data_vars={'data': data, 'epoch': epoch, 'direction':direction}, attrs=global_attributes)
>>> os.remove('hello.cdf')
>>> xarray_to_cdf(ds, 'hello.cdf')
Example netCDF -> CDF conversion
>>> # Download a netCDF file (if needed)
>>> fname = 'dn_magn-l2-hires_g17_d20211219_v1-0-1.nc'
>>> url = ("https://lasp.colorado.edu/maven/sdc/public/data/sdc/web/cdflib_testing/dn_magn-l2-hires_g17_d20211219_v1-0-1.nc")
>>> if not os.path.exists(fname):
>>>     urllib.request.urlretrieve(url, fname)

>>> # Load in the dataset, and set VAR_TYPE attributes (the most important attribute as far as this code is concerned)
>>> goes_r_mag = xr.load_dataset("dn_magn-l2-hires_g17_d20211219_v1-0-1.nc")
>>> for var in goes_r_mag:
>>>     goes_r_mag[var].attrs['VAR_TYPE'] = 'data'
>>> goes_r_mag['coordinate'].attrs['VAR_TYPE'] = 'support_data'
>>> goes_r_mag['time'].attrs['VAR_TYPE'] = 'support_data'
>>> goes_r_mag['time_orbit'].attrs['VAR_TYPE'] = 'support_data'

>>> # Create the CDF file
>>> xarray_to_cdf(goes_r_mag, 'hello.cdf')
Processing Steps
1. Determine the list of dimensions that represent time-varying dimensions. These ultimately become the "records" of the CDF file (if none can be determined automatically, see the record_dimensions sketch after this list)
    - If it is named "epoch" or "epoch_N", it is considered time-varying
    - If a variable points to another variable with a DEPEND_0 attribute, it is considered time-varying
    - If a variable has an attribute of VAR_TYPE equal to "data", it is time-varying
    - If a variable has an attribute of VAR_TYPE equal to "support_data" and it is 2 dimensional, it is time-varying
2. Determine a list of "dimension" variables within the Dataset object
    - These are all coordinates in the dataset that are not time-varying
    - Additionally, variables that a DEPEND_N attribute points to are also considered dimensions
3. Optionally, if ISTP=true, automatically add in DEPEND_0/1/2/etc attributes as necessary
4. Optionally, if ISTP=true, check all variable attributes and global attributes are present
5. Convert all data into either CDF_INT8, CDF_DOUBLE, CDF_UINT4, or CDF_CHAR
6. Optionally, convert variables with the name "epoch" or "epoch_N" to CDF_TT2000
7. Write all variables and global attributes to the CDF file!
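If the record dimension cannot be inferred by the checks in step 1 (for example, a Dataset with no "epoch" variable and no DEPEND_0 attributes), it can be named explicitly through record_dimensions. A minimal sketch, assuming a hypothetical Dataset ds whose time-varying dimension is called 'time':

>>> # Explicitly treat the 'time' dimension as the CDF record dimension,
>>> # since there is no "epoch" variable or DEPEND_0 attribute to infer it from
>>> xarray_to_cdf(ds, 'output.cdf', record_dimensions=['time'])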
ISTP Warnings
If ISTP=true, these are some of the common things it will check:

- Missing or invalid VAR_TYPE variable attributes
- DEPEND_N missing from variables
- DEPEND_N/LABL_PTR/UNIT_PTR/FORM_PTR are pointing to missing variables
- Missing required global attributes
- Conflicting global attributes
- Missing an "epoch" dimension
- DEPEND_N attribute pointing to a variable with incompatible dimensions
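If you do not want these checks at all, or want them treated as fatal, the istp and terminate_on_warning flags control that behavior. A brief sketch, assuming a hypothetical Dataset ds:

>>> # Skip the ISTP compliance checks entirely
>>> xarray_to_cdf(ds, 'raw.cdf', istp=False)
>>> # Or treat any ISTP warning as a fatal error
>>> xarray_to_cdf(ds, 'strict.cdf', terminate_on_warning=True)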
CDF Data Types
All variable data is automatically converted to one of the following CDF types, based on the type of data in the xarray Dataset:

=============  ===============
Numpy type     CDF Data Type
=============  ===============
np.datetime64  CDF_TIME_TT2000
np.int8        CDF_INT1
np.int16       CDF_INT2
np.int32       CDF_INT4
np.int64       CDF_INT8
np.float16     CDF_FLOAT
np.float32     CDF_FLOAT
np.float64     CDF_DOUBLE
np.uint8       CDF_UINT1
np.uint16      CDF_UINT2
np.uint32      CDF_UINT4
np.complex_    CDF_EPOCH16
np.str_        CDF_CHAR
np.bytes_      CDF_CHAR
object         CDF_CHAR
datetime       CDF_TIME_TT2000
=============  ===============

If you want to attempt to cast your data to a different type, add an attribute called "CDF_DATA_TYPE" to your variable.
xarray_to_cdf will read this attribute and override the default conversion. Valid choices are listed below, with a short example after the list:

- Integers: CDF_INT1, CDF_INT2, CDF_INT4, CDF_INT8
- Unsigned Integers: CDF_UINT1, CDF_UINT2, CDF_UINT4
- Floating Point: CDF_REAL4, CDF_FLOAT, CDF_DOUBLE, CDF_REAL8
- Time: CDF_EPOCH, CDF_EPOCH16, CDF_TIME_TT2000
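For instance, to force a float64 variable to be written as CDF_REAL4 rather than the default CDF_DOUBLE, set the attribute before converting. A short sketch, assuming a hypothetical Dataset ds with a variable named 'data':

>>> # Override the default float64 -> CDF_DOUBLE mapping for this variable
>>> ds['data'].attrs['CDF_DATA_TYPE'] = 'CDF_REAL4'
>>> xarray_to_cdf(ds, 'output.cdf')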
Source code in cdflib/xarray/xarray_to_cdf.py
def xarray_to_cdf(
    xarray_dataset: xr.Dataset,
    file_name: str,
    unix_time_to_cdf_time: bool = False,
    istp: bool = True,
    terminate_on_warning: bool = False,
    auto_fix_depends: bool = True,
    record_dimensions: List[str] = ["record0"],
    compression: int = 0,
    nan_to_fillval: bool = True,
) -> None:
    """
    This function converts XArray Dataset objects into CDF files.

    Parameters
    ----------
    xarray_dataset : xarray.Dataset
        The XArray Dataset object that you'd like to convert into a CDF file
    file_name : str
        The path at which to place the newly created CDF file
    unix_time_to_cdf_time : bool, optional
        Whether or not to assume variables that will become a CDF_EPOCH/EPOCH16/TT2000 are a unix timestamp
    istp : bool, optional
        Whether or not to do checks on the Dataset object to attempt to enforce CDF compliance
    terminate_on_warning : bool, optional
        Whether or not to throw an error when given warnings or to continue trying to make the file
    auto_fix_depends : bool, optional
        Whether or not to automatically add dependencies
    record_dimensions : list of str, optional
        If the code cannot determine which dimensions should be made into CDF records, you may provide a list of them here
    compression : int, optional
        The level of compression to gzip the data in the variables.  Default is no compression, standard is 6.
    nan_to_fillval : bool, optional
        Convert all np.nan and np.datetime64('NaT') to the standard CDF FILLVALs.

    Returns
    -------
    None
        Function generates a CDF file

    Example CDF file from scratch
    ------------------------------
    ```python
    >>> # Import the needed libraries
    >>> from cdflib.xarray import xarray_to_cdf
    >>> import xarray as xr
    >>> import os
    >>> import urllib.request

    >>> # Create some fake data
    >>> var_data = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
    >>> var_dims = ['epoch', 'direction']
    >>> data = xr.Variable(var_dims, var_data)

    >>> # Create fake epoch data
    >>> epoch_data = [1, 2, 3]
    >>> epoch_dims = ['epoch']
    >>> epoch = xr.Variable(epoch_dims, epoch_data)

    >>> # Combine the two into an xarray Dataset and export as CDF (this will print out many ISTP warnings)
    >>> ds = xr.Dataset(data_vars={'data': data, 'epoch': epoch})
    >>> xarray_to_cdf(ds, 'hello.cdf')

    >>> # Add some global attributes
    >>> global_attributes = {'Project': 'Hail Mary',
    >>>                      'Source_name': 'Thin Air',
    >>>                      'Discipline': 'None',
    >>>                      'Data_type': 'counts',
    >>>                      'Descriptor': 'Midichlorians in unicorn blood',
    >>>                      'Data_version': '3.14',
    >>>                      'Logical_file_id': 'SEVENTEEN',
    >>>                      'PI_name': 'Darth Vader',
    >>>                      'PI_affiliation': 'Dark Side',
    >>>                      'TEXT': 'AHHHHH',
    >>>                      'Instrument_type': 'Banjo',
    >>>                      'Mission_group': 'Impossible',
    >>>                      'Logical_source': ':)',
    >>>                      'Logical_source_description': ':('}

    >>> # Let's add a new coordinate variable for the "direction"
    >>> dir_data = [1, 2, 3]
    >>> dir_dims = ['direction']
    >>> direction = xr.Variable(dir_dims, dir_data)

    >>> # Recreate the Dataset with these new objects, and recreate the CDF
    >>> ds = xr.Dataset(data_vars={'data': data, 'epoch': epoch, 'direction':direction}, attrs=global_attributes)
    >>> os.remove('hello.cdf')
    >>> xarray_to_cdf(ds, 'hello.cdf')
    ```

    Example netCDF -> CDF conversion
    --------------------------------
    ```python
    >>> # Download a netCDF file (if needed)
    >>> fname = 'dn_magn-l2-hires_g17_d20211219_v1-0-1.nc'
    >>> url = ("https://lasp.colorado.edu/maven/sdc/public/data/sdc/web/cdflib_testing/dn_magn-l2-hires_g17_d20211219_v1-0-1.nc")
    >>> if not os.path.exists(fname):
    >>>     urllib.request.urlretrieve(url, fname)

    >>> # Load in the dataset, and set VAR_TYPE attributes (the most important attribute as far as this code is concerned)
    >>> goes_r_mag = xr.load_dataset("dn_magn-l2-hires_g17_d20211219_v1-0-1.nc")
    >>> for var in goes_r_mag:
    >>>     goes_r_mag[var].attrs['VAR_TYPE'] = 'data'
    >>> goes_r_mag['coordinate'].attrs['VAR_TYPE'] = 'support_data'
    >>> goes_r_mag['time'].attrs['VAR_TYPE'] = 'support_data'
    >>> goes_r_mag['time_orbit'].attrs['VAR_TYPE'] = 'support_data'

    >>> # Create the CDF file
    >>> xarray_to_cdf(goes_r_mag, 'hello.cdf')
    ```

    Processing Steps
    ----------------
        1. Determines the list of dimensions that represent time-varying dimensions.  These ultimately become the "records" of the CDF file
            - If it is named "epoch" or "epoch_N", it is considered time-varying
            - If a variable points to another variable with a DEPEND_0 attribute, it is considered time-varying
            - If a variable has an attribute of VAR_TYPE equal to "data", it is time-varying
            - If a variable has an attribute of VAR_TYPE equal to "support_data" and it is 2 dimensional, it is time-varying
        2. Determine a list of "dimension" variables within the Dataset object
            - These are all coordinates in the dataset that are not time-varying
            - Additionally, variables that a DEPEND_N attribute points to are also considered dimensions
        3. Optionally, if ISTP=true, automatically add in DEPEND_0/1/2/etc attributes as necessary
        4. Optionally, if ISTP=true, check all variable attributes and global attributes are present
        5. Convert all data into either CDF_INT8, CDF_DOUBLE, CDF_UINT4, or CDF_CHAR
        6. Optionally, convert variables with the name "epoch" or "epoch_N" to CDF_TT2000
        7. Write all variables and global attributes to the CDF file!

    ISTP Warnings
    -------------
        If ISTP=true, these are some of the common things it will check:

        - Missing or invalid VAR_TYPE variable attributes
        - DEPEND_N missing from variables
        - DEPEND_N/LABL_PTR/UNIT_PTR/FORM_PTR are pointing to missing variables
        - Missing required global attributes
        - Conflicting global attributes
        - Missing an "epoch" dimension
        - DEPEND_N attribute pointing to a variable with incompatible dimensions

    CDF Data Types
    --------------
        All variable data is automatically converted to one of the following CDF types, based on the type of data in the xarray Dataset:

        =============  ===============
        Numpy type     CDF Data Type
        =============  ===============
        np.datetime64  CDF_TIME_TT2000
        np.int8        CDF_INT1
        np.int16       CDF_INT2
        np.int32       CDF_INT4
        np.int64       CDF_INT8
        np.float16     CDF_FLOAT
        np.float32     CDF_FLOAT
        np.float64     CDF_DOUBLE
        np.uint8       CDF_UINT1
        np.uint16      CDF_UINT2
        np.uint32      CDF_UINT4
        np.complex_    CDF_EPOCH16
        np.str_        CDF_CHAR
        np.bytes_      CDF_CHAR
        object         CDF_CHAR
        datetime       CDF_TIME_TT2000
        =============  ===============

        If you want to attempt to cast your data to a different type, you need to add an attribute to your variable called "CDF_DATA_TYPE".
        xarray_to_cdf will read this attribute and override the default conversions.  Valid choices are

        - Integers: CDF_INT1, CDF_INT2, CDF_INT4, CDF_INT8
        - Unsigned Integers: CDF_UINT1, CDF_UINT2, CDF_UINT4
        - Floating Point: CDF_REAL4, CDF_FLOAT, CDF_DOUBLE, CDF_REAL8
        - Time: CDF_EPOCH, CDF_EPOCH16, CDF_TIME_TT2000

    """

    if os.path.isfile(file_name):
        _warn_or_except(f"{file_name} already exists, cannot create CDF file.  Returning...", terminate_on_warning)
        return

    # Make a deep copy of the data before continuing
    dataset = xarray_dataset.copy()

    if nan_to_fillval:
        _convert_nans_to_fillval(dataset)

    if istp:
        # This checks all the variable and attribute names to ensure they are ISTP compliant.
        _validate_varatt_names(dataset, terminate_on_warning)

        # This creates a list of suspected or confirmed label variables
        _label_checker(dataset, terminate_on_warning)

        # This creates a list of suspected or confirmed dimension variables
        dim_vars = _dimension_checker(dataset, terminate_on_warning)

        # This creates a list of suspected or confirmed record variables
        depend_0_vars, time_varying_dimensions = _epoch_checker(dataset, dim_vars, terminate_on_warning)

        depend_0_vars = record_dimensions + depend_0_vars
        time_varying_dimensions = record_dimensions + time_varying_dimensions

        # After we do the first pass of checking for dimensions and record variables, lets do a second pass to make sure
        # we've got everything
        dim_vars = _recheck_dimensions_after_epoch_checker(dataset, time_varying_dimensions, dim_vars, terminate_on_warning)

        # This function will alter the attributes of the data variables if needed
        dataset = _add_depend_variables_to_dataset(
            dataset, dim_vars, depend_0_vars, time_varying_dimensions, terminate_on_warning, auto_fix_depends
        )

        _global_attribute_checker(dataset, terminate_on_warning)

        _variable_attribute_checker(dataset, depend_0_vars, terminate_on_warning)
    else:
        depend_0_vars = record_dimensions
        time_varying_dimensions = record_dimensions

    # Gather the global attributes, write them into the file
    glob_att_dict: Dict[str, Dict[int, Any]] = {}
    for ga in dataset.attrs:
        if hasattr(dataset.attrs[ga], "__iter__") and not isinstance(dataset.attrs[ga], str):
            i = 0
            glob_att_dict[ga] = {}
            for entry in dataset.attrs[ga]:
                glob_att_dict[ga][i] = entry
                i += 1
        else:
            glob_att_dict[ga] = {0: dataset.attrs[ga]}

    x = CDF(file_name)
    x.write_globalattrs(glob_att_dict)

    # Gather the variables, write them into the file
    datasets = (dataset, dataset.coords)
    for d in datasets:
        for var in d:
            var = cast(str, var)

            cdf_data_type, cdf_num_elements = _dtype_to_cdf_type(d[var])
            if cdf_data_type is None or cdf_num_elements is None:
                continue

            if len(d[var].dims) > 0:
                if var in time_varying_dimensions or var in depend_0_vars:
                    dim_sizes = list(d[var].shape[1:])
                    record_vary = True
                else:
                    dim_sizes = list(d[var].shape)
                    record_vary = False
            else:
                dim_sizes = []
                record_vary = True

            var_data = d[var].data

            cdf_epoch = False
            cdf_epoch16 = False
            if "CDF_DATA_TYPE" in d[var].attrs:
                if d[var].attrs["CDF_DATA_TYPE"] == "CDF_EPOCH":
                    cdf_epoch = True
                elif d[var].attrs["CDF_DATA_TYPE"] == "CDF_EPOCH16":
                    cdf_epoch16 = True

            if _is_datetime_array(d[var].data) or _is_datetime64_array(d[var].data):
                var_data = _datetime_to_cdf_time(d[var], cdf_epoch=cdf_epoch, cdf_epoch16=cdf_epoch16)
            elif unix_time_to_cdf_time:
                if _is_istp_epoch_variable(var) or (
                    DATATYPES_TO_STRINGS[cdf_data_type] in ("CDF_EPOCH", "CDF_EPOCH16", "CDF_TIME_TT2000")
                ):
                    var_data = _unixtime_to_cdf_time(d[var].data, cdf_epoch=cdf_epoch, cdf_epoch16=cdf_epoch16)

            # Grab the attributes from xarray, and attempt to convert VALIDMIN and VALIDMAX to the same data type as the variable
            var_att_dict = {}
            for att in d[var].attrs:
                var_att_dict[att] = d[var].attrs[att]
                if _is_datetime_array(d[var].attrs[att]) or _is_datetime64_array(d[var].attrs[att]):
                    att_data = _datetime_to_cdf_time(d[var], cdf_epoch=cdf_epoch, cdf_epoch16=cdf_epoch16, attribute_name=att)
                    var_att_dict[att] = [att_data, DATATYPES_TO_STRINGS[cdf_data_type]]
                elif unix_time_to_cdf_time:
                    if "TIME_ATTRS" in d[var].attrs:
                        if att in d[var].attrs["TIME_ATTRS"]:
                            if DATATYPES_TO_STRINGS[cdf_data_type] in ("CDF_EPOCH", "CDF_EPOCH16", "CDF_TIME_TT2000"):
                                att_data = _unixtime_to_cdf_time(d[var].attrs[att], cdf_epoch=cdf_epoch, cdf_epoch16=cdf_epoch16)
                                var_att_dict[att] = [att_data, DATATYPES_TO_STRINGS[cdf_data_type]]
                elif (att == "VALIDMIN" or att == "VALIDMAX" or att == "FILLVAL") and istp:
                    var_att_dict[att] = [d[var].attrs[att], DATATYPES_TO_STRINGS[cdf_data_type]]

            var_spec = {
                "Variable": var,
                "Data_Type": cdf_data_type,
                "Num_Elements": cdf_num_elements,
                "Rec_Vary": record_vary,
                "Dim_Sizes": dim_sizes,
                "Compress": compression,
            }

            x.write_var(var_spec, var_attrs=var_att_dict, var_data=var_data)

    x.close()

    return