Solutions

Comma separated files

  • Read in the CSV file, skipping the first 6 rows, using whitespace as the column separator, and treating -99.9 and -99.99 as invalid (missing) data:
import pandas as pd

weather_csv = 'cetml1659on.dat'
weather_df = pd.read_csv(weather_csv,
                         skiprows=6,
                         sep=r'\s+',  # raw string, so '\s' is not read as a string escape
                         na_values=['-99.9', '-99.99']
                        )
print(weather_df.head())

Output:

      JAN  FEB  MAR  APR   MAY   JUN   JUL   AUG   SEP   OCT  NOV  DEC  YEAR
1659  3.0  4.0  6.0  7.0  11.0  13.0  16.0  16.0  13.0  10.0  5.0  2.0  8.87
1660  0.0  4.0  6.0  9.0  11.0  14.0  15.0  16.0  13.0  10.0  6.0  5.0  9.10
1661  5.0  5.0  6.0  8.0  11.0  14.0  15.0  15.0  13.0  11.0  8.0  6.0  9.78
1662  5.0  6.0  6.0  8.0  11.0  15.0  15.0  15.0  13.0  11.0  6.0  3.0  9.52
1663  1.0  1.0  5.0  7.0  10.0  14.0  15.0  15.0  13.0  10.0  7.0  5.0  8.63
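
To see how these read_csv options work together without needing the data file, here is a minimal sketch on a made-up, in-memory sample. The sample text, its two header lines, and the names sample_text and sample_df are assumptions for illustration only, not part of the real dataset.

```python
import io
import pandas as pd

# Hypothetical miniature of the data file: two descriptive lines to skip,
# then a whitespace-separated table whose header row has one fewer field
# than the data rows (the year has no column name).
sample_text = """Made-up description line 1
Made-up description line 2
      JAN   FEB    YEAR
1659  3.0   4.0    8.87
1660  0.0   -99.9  -99.99
"""

sample_df = pd.read_csv(io.StringIO(sample_text),
                        skiprows=2,                     # skip the descriptive lines
                        sep=r'\s+',                     # any run of whitespace separates fields
                        na_values=['-99.9', '-99.99'])  # treat these markers as missing

print(sample_df)
print(sample_df.isna().sum())  # number of missing values per column
```

Because the header row has one fewer field than the data rows, pandas uses the first field of each data row (the year) as the index, which is why the year appears without a column name in the output above.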
  • Select all rows where the value in the January column is less than 0, and use len() so we don't have to count the rows ourselves:
weather_df[weather_df['JAN'] < 0] # Would output all the entries
len(weather_df[weather_df['JAN'] < 0]) # Just counts the number of rows

Output:

20
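
As an aside, the same count can be obtained by summing the boolean mask directly, since each True counts as 1. The sketch below uses a small made-up table (toy_df and its values are assumptions, not the real temperatures) to show that both approaches agree.

```python
import pandas as pd

# Made-up January values, for illustration only
toy_df = pd.DataFrame({'JAN': [3.0, -0.5, 0.0, -1.2, float('nan')]},
                      index=[1659, 1660, 1661, 1662, 1663])

mask = toy_df['JAN'] < 0         # boolean Series: True where January is below 0
count_len = len(toy_df[mask])    # filter the rows, then count them
count_sum = int(mask.sum())      # count the True values directly

print(count_len, count_sum)      # prints: 2 2
```

Note that a missing value (NaN) compares as False, so it is not counted as being below 0.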
  • The average of the data can be found using the .mean() method:
weather_df['JUN'].mean()

Output:

14.325977653631282
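
It is worth knowing that .mean() skips missing values by default, so any entries replaced through na_values do not distort the average. A minimal sketch on made-up numbers (the Series toy_jun is an assumption for illustration):

```python
import pandas as pd

# Made-up June values with one missing entry
toy_jun = pd.Series([13.0, 14.0, float('nan'), 15.0])

mean_default = toy_jun.mean()             # NaN is skipped: (13 + 14 + 15) / 3
mean_strict = toy_jun.mean(skipna=False)  # NaN propagates into the result

print(mean_default)  # prints: 14.0
print(mean_strict)   # prints: nan
```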
