Re: [weewx-user] Re: weeimport a smattering of missing records

gjr80 Sat, 22 Mar 2025 02:02:11 -0700

Let's take it from the top. Please post the csv data (preferably the file) 
and import config file that caused the original error. Unless you are 
talking about a fairly obvious error it is extremely difficult (and time 
consuming) to work through a failed weectl import session without those two 
pieces of the puzzle.


Gary
On Saturday, 22 March 2025 at 09:12:47 UTC+10 Andrew McGinnis wrote:

> So odd thing happens when I try to weectl --import.  
>
> I dumped the data from the database I want to take it from (as well as the 
> production one) via 
> sudo sqlite3 -header -csv /var/lib/weewx/weewx.sdb 'select * from archive 
> where datetime >= 1742443200 and datetime <= 1742529600;' > 20th_pi4b8.csv 
>
> I copied both over to my PC and ran both of the output files through Excel 
> to reconcile between the two, saving out a single 20th.csv with the data 
> cleaned up as I want it to be.  The schema/header and values are laid out 
> exactly as my database schema is.  The values are unmodified from the 
> sqlite output, beyond copy/paste in Excel.  
>
> Copied that cleaned 20th.csv back to the production weewx pi, and ran:
>
> pi@pi4b8:~/weeimport $ weectl import 
> --import-config=/home/pi/weeimport/20th.conf --update --dry-run
> Using configuration file /etc/weewx/weewx.conf
> This is a dry run. Nothing will actually be done.
> Starting weectl import...
> A CSV import from source file '/home/pi/weeimport/20th.csv' has been 
> requested.
> Using database binding 'wx_binding', which is bound to database 'weewx.sdb'
> Destination table 'archive' unit system is '0x01' (US).
> Imported records will overwrite existing database records.
> All WeeWX UV fields will be set to None.
> All WeeWX radiation fields will be set to None.
> This is a dry run, imported data will not be saved to archive.
> Starting dry run import ...
> Traceback (most recent call last):
>   File "/usr/share/weewx/weeimport/weeimport.py", line 856, in map_raw_data
>     _value = float(_row[self.map[_field]['source_field']].strip())
> ValueError: could not convert string to float: ''
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/share/weewx/weectl.py", line 75, in <module>
>     main()
>   File "/usr/share/weewx/weectl.py", line 67, in main
>     namespace.func(namespace)
>   File "/usr/share/weewx/weectllib/__init__.py", line 90, in dispatch
>     namespace.action_func(config_dict, namespace)
>   File "/usr/share/weewx/weectllib/import_cmd.py", line 85, in import_func
>     weectllib.import_actions.obs_import(config_dict,
>   File "/usr/share/weewx/weectllib/import_actions.py", line 58, in 
> obs_import
>     source_obj.run()
>   File "/usr/share/weewx/weeimport/weeimport.py", line 406, in run
>     _mapped_data = self.map_raw_data(_raw_data, self.archive_unit_sys)
>   File "/usr/share/weewx/weeimport/weeimport.py", line 904, in map_raw_data
>     self.map[_field]['unit'] == 'degree_compass':
>   File "/usr/lib/python3/dist-packages/configobj.py", line 554, in 
> __getitem__
>     val = dict.__getitem__(self, key)
> KeyError: 'unit'
>
> I looked at the file in nano, and noticed it was saved in DOS format.  I 
> changed it to be not-DOS/Mac (...native then I guess?), but same result. 
>
> If I go in through 
> pi@pi4b8:~/weeimport $ sudo sqlite3 /var/lib/weewx/weewx.sdb 
> SQLite version 3.34.1 2021-01-20 14:10:07
> Enter ".help" for usage hints.
> sqlite> DELETE FROM archive WHERE datetime >= 1742443200 and datetime <= 
> 1742529600;
> sqlite> .mode csv archive
> sqlite> .import 20th.csv archive
> 20th.csv:1: INSERT failed: datatype mismatch
> 20th.csv:1443: INSERT failed: datatype mismatch
> sqlite> .quit
>
> The data is in there though (and in the correct columns- no offset 
> errors). Doing weectl database --rebuild-daily --date=2025-03-20 rebuilds 
> successfully.  That said, my graphs look... off, where the data is highly 
> variable in it's 10-min interval that the graph uses for plot points. You 
> can see it's very smooth once I'm back to using directly captured data.
> Current (trailing 24hr):
> [image: image.png] [image: image.png]
> Yesterday (00:00-23:59)
> [image: image.png] [image: image.png]
>
> The sqlite output for '*select datetime,outTemp from archive...;*' for 
> Mar-20 13:30-15:30 (imported data) and Mar-21 13:30-15:30 (direct) have a 
> similar number of missing records for the outTemp and the 
> extraTemp/extraHumid observations, but the imported data has '
> *epochtime,""*' in a csv output for the blank records, where the direct's 
> csv output is just '*epochtime,'*
>
> The graphs are being generated (with a 10-minute resolution) very 
> differently. If I take a 10-min period of records:
> -the graph point for the imported data is the sum of the 10 records, 
> divided by 10, even though 8 were actually recorded- averaging in every 
> blank as a 0
> -the graph point for my direct observations, the same period also had 
> blanks, but the average was taken of the records present.
>
> I imagine everything in the background is likewise factoring what are 
> meant to be blanks, as zeros. 
>
> So, the sqlite import added "" for blank records to the database it would 
> seem.  Before I go even more leeroy jenkins on my database, what do I need 
> to do? Go do it again, but add/insert *None* for every blank observation? 
> add/insert *""* for every blank? Or better, what am I not seeing/missing 
> in my weectl import that it's erroring out?
>
> Here is the first two lines of the csv file I'm trying to import for 
> context:
>
> pi@pi4b8:~/weeimport $ head -2 20th.csv 
>
> dateTime,usUnits,interval,altimeter,appTemp,appTemp1,barometer,batteryStatus1,batteryStatus2,batteryStatus3,batteryStatus4,batteryStatus5,batteryStatus6,batteryStatus7,batteryStatus8,cloudbase,co,co2,consBatteryVoltage,dewpoint,dewpoint1,ET,extraHumid1,extraHumid2,extraHumid3,extraHumid4,extraHumid5,extraHumid6,extraHumid7,extraHumid8,extraTemp1,extraTemp2,extraTemp3,extraTemp4,extraTemp5,extraTemp6,extraTemp7,extraTemp8,forecast,hail,hailBatteryStatus,hailRate,heatindex,heatindex1,heatingTemp,heatingVoltage,humidex,humidex1,inDewpoint,inHumidity,inTemp,inTempBatteryStatus,leafTemp1,leafTemp2,leafWet1,leafWet2,lightning_distance,lightning_disturber_count,lightning_energy,lightning_noise_count,lightning_strike_count,luminosity,maxSolarRad,nh3,no2,noise,o3,outHumidity,outTemp,outTempBatteryStatus,pb,pm10_0,pm1_0,pm2_5,pressure,radiation,rain,rainBatteryStatus,rainRate,referenceVoltage,rxCheckPercent,signal1,signal2,signal3,signal4,signal5,signal6,signal7,signal8,snow,snowBatteryStatus,snowDepth,snowMoisture,snowRate,so2,soilMoist1,soilMoist2,soilMoist3,soilMoist4,soilTemp1,soilTemp2,soilTemp3,soilTemp4,supplyVoltage,txBatteryStatus,UV,uvBatteryStatus,windBatteryStatus,windchill,windDir,windGust,windGustDir,windrun,windSpeed,pm4_0,pm4_0out,pm2_5out,pm1_0out,pm10_0out,co2out
>
> 1742443200,1,1,30.13063837,62.49347962,,30.12772479,0,1,1,1,1,,,,604.5454057,,1106,1500,58.65679821,,,59,57,58,40,51,,,,72.61,72.37,72.47727273,78.30909091,76.82,,,,,,,,61.1239978,,,,67.94440487,,56.89523489,54,74.66,1,,,,,,,,,,,0,,,,,92,60.999998,1,,2.8,0.6,1.6,30.04971646,,0,0.88889,0,,,,,,,,,,,20.12999882,1,,,,,45.88235294,,,,70.7,,,,,1,,,1,60.999998,202.5,2.678862144,202.5,0.038218048,2.293082881,2.4,,2,,2,
>
> On Sun, Mar 16, 2025 at 4:27 PM gjr80 <gjrod...@gmail.com> wrote:
>
>> That should be fine. No need for 'empty' fields to be included in the 
>> header line and data lines, but in this case it is probably the 
>> simplest/quickest approach. Just be very careful that you don't end up with 
>> any missing or extra commas as you will likely then put the wrong data in 
>> the wrong field.
>>
>> Gary
>> On Sunday, 16 March 2025 at 05:57:55 UTC+10 Andrew McGinnis wrote:
>>
>>> Via a dump of every column in the table that exists, whether I'm using 
>>> it or not, the csv content could be:
>>>
>>>
>>> "dateTime","usUnits","interval","altimeter","appTemp","appTemp1","barometer","batteryStatus1","batteryStatus2","batteryStatus3","batteryStatus4","batteryStatus5","batteryStatus6","batteryStatus7","batteryStatus8","cloudbase","co","co2","consBatteryVoltage","dewpoint","dewpoint1","ET","extraHumid1","extraHumid2","extraHumid3","extraHumid4","extraHumid5","extraHumid6","extraHumid7","extraHumid8","extraTemp1","extraTemp2","extraTemp3","extraTemp4","extraTemp5","extraTemp6","extraTemp7","extraTemp8","forecast","hail","hailBatteryStatus","hailRate","heatindex","heatindex1","heatingTemp","heatingVoltage","humidex","humidex1","inDewpoint","inHumidity","inTemp","inTempBatteryStatus","leafTemp1","leafTemp2","leafWet1","leafWet2","lightning_distance","lightning_disturber_count","lightning_energy","lightning_noise_count","lightning_strike_count","luminosity","maxSolarRad","nh3","no2","noise","o3","outHumidity","outTemp","outTempBatteryStatus","pb","pm10_0","pm1_0","pm2_5","pressure","radiation","rain","rainBatteryStatus","rainRate","referenceVoltage","rxCheckPercent","signal1","signal2","signal3","signal4","signal5","signal6","signal7","signal8","snow","snowBatteryStatus","snowDepth","snowMoisture","snowRate","so2","soilMoist1","soilMoist2","soilMoist3","soilMoist4","soilTemp1","soilTemp2","soilTemp3","soilTemp4","supplyVoltage","txBatteryStatus","UV","uvBatteryStatus","windBatteryStatus","windchill","windDir","windGust","windGustDir","windrun","windSpeed","pm4_0","pm4_0out","pm2_5out","pm1_0out","pm10_0out","co2out"
>>>
>>> 1741612440,1,1,,,,,1.0,,,1.0,,,,,,,,,,,,58.0,,,58.0,,,,,75.02,,,77.36,,,,,,,,,,,,,,,,,,,,,,,,,,,,,238.51020260006104,,,,,,,,,,,,,
>>> 0.00999881889763705
>>> ,,0.0199976377952741,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
>>>
>>> where the larger font size one is the rain amount that I'm inserting. 
>>> Then one observation line for each record period that I need to change 
>>> (that way, no change is required to the column name line for periods that 
>>> have a different mix of observations with data)?
>>>
>>> Only 8 observations got recorded in that period I have above (I have 
>>> more than 8 XXXbatteryStatus observations alone), but at this point I'm 
>>> just seeking to add the rain observation.
>>>
>>> On Fri, Mar 14, 2025 at 9:41 PM gjr80 <gjrod...@gmail.com> wrote:
>>>
>>>> See comments below.
>>>>
>>>> Gary
>>>>
>>>> On Saturday, 15 March 2025 at 03:47:33 UTC+10 Andrew McGinnis wrote:
>>>>
>>>> If you want to use weectl import to import data into existing archive 
>>>> records, each record in your import source data (in your case your CSV 
>>>> data) needs to include the existing data for all fields as well as the 
>>>> updated rainfall data. You then need to use the (as yet undocumented) 
>>>> --update command line option when running weectl import. 
>>>>
>>>>
>>>> So rather than just the records I'm *missing*, I need to go from the 
>>>> first missing record to the last missing record, inclusive of all 
>>>> in-between records? I can easily do that, just not 100% on whether it's 
>>>> required. 
>>>>
>>>>
>>>> No, you don't need 'in between records'. Your import data should only 
>>>> include the records you intend to update/alter, you do need the data from 
>>>> the other fields of the records you are updating/importing. Perhaps an 
>>>> example might make it clearer. Say your archive table contains the 
>>>> following:
>>>>
>>>> dateTime    outTemp  inTemp  rain  outHumidity
>>>> 1742000400  23.1     20.1     1.0    80
>>>> 1742000100  23.2     20.1     2.0    81
>>>> 1741999800  23.2     20.1     5.0    84
>>>> 1741999500  23.3     20.2     0.0    83
>>>> 1741999200  23.5     20.2     0.8    83
>>>> 1741998900  23.5     20.2     0.6    84
>>>> 1741998600  23.4     20.1     0.6    82
>>>> 1741998300  23.3     20.1     0.8    83
>>>> 1741998000  23.3     20.2     0.0    84
>>>> 1741997700  23.3     20.1     1.0    82
>>>>
>>>> Lets say you want to update the rain value in records timestamped 
>>>> 1741999500 and 1741998000 (ie the 0.0 values) and your new rain values 
>>>> (per archive period values) for these two records are 0.2 and 0.4 
>>>> respectively. So you might start with your import data as follows:
>>>>
>>>> dateTime      rain
>>>> 1741999500     0.2
>>>> 1741998000     0.4
>>>>
>>>> If you import this data using the --update command line option (or 
>>>> without the --update command line option if you deleted the two 
>>>> archive table records concerned) your archive table will be:
>>>>
>>>> dateTime    outTemp  inTemp  rain  outHumidity
>>>> 1742000400  23.1     20.1     1.0    80
>>>> 1742000100  23.2     20.1     2.0    81
>>>> 1741999800  23.2     20.1     5.0    84
>>>> 1741999500                    0.2    
>>>> 1741999200  23.5     20.2     0.8    83
>>>> 1741998900  23.5     20.2     0.6    84
>>>> 1741998600  23.4     20.1     0.6    82
>>>> 1741998300  23.3     20.1     0.8    83
>>>> 1741998000                    0.4    
>>>> 1741997700  23.3     20.1     1.0    82
>>>>
>>>> Each row in your import data is considered to be an archive record in 
>>>> itself and any fields that are not included in your import data will be 
>>>> empty. To keep the rest of the existing data in the records being updated 
>>>> (in this case outTemp, inTemp and outHumidity), your import data needs 
>>>> to include values for outTemp, inTemp and outHumidity. Your import 
>>>> data would be:
>>>>
>>>> dateTime    outTemp  inTemp  rain  outHumidity
>>>> 1741999500  23.3     20.2     0.2    83
>>>> 1741998000  23.3     20.2     0.4    84
>>>>
>>>> Importing this data will give you:
>>>>
>>>> dateTime    outTemp  inTemp  rain  outHumidity
>>>> 1742000400  23.1     20.1     1.0    80
>>>> 1742000100  23.2     20.1     2.0    81
>>>> 1741999800  23.2     20.1     5.0    84
>>>> 1741999500  23.3     20.2     0.2    83
>>>> 1741999200  23.5     20.2     0.8    83
>>>> 1741998900  23.5     20.2     0.6    84
>>>> 1741998600  23.4     20.1     0.6    82
>>>> 1741998300  23.3     20.1     0.8    83
>>>> 1741998000  23.3     20.2     0.4    84
>>>> 1741997700  23.3     20.1     1.0    82
>>>>
>>>> Which I suspect is the desired result. Of course this is a very simple 
>>>> and unrealistic example but hopefully you get the idea.
>>>>  
>>>>
>>>>
>>>> Alternatively, you could avoid using the --update command line option 
>>>> by first deleting the records to be imported from the WeeWX database and 
>>>> then importing your merged/updated CSV data.
>>>>
>>>>
>>>> If you'll save me the research, how do I delete specific 
>>>> records/periods, either a single period record, or a range? I just 
>>>> recently 
>>>> noticed that the soilMoist1 observation, on my weewx testing 
>>>> install, recorded about 84 hours of very wrong values, that my production 
>>>> weewx install didn't (both getting the data from the same mqtt topic). I 
>>>> know how to drop entire observation columns via weectl database 
>>>> drop-columns NAME, but in this case it's just range of periods within 
>>>> the column that need purging. Or is it as simple as, zeroing out the 
>>>> values 
>>>> in a range via sqlite> update archive set soilMoist1=0.0 where 
>>>> dateTime > {startepochtime} and dateTime <= {endepochtime]; ?
>>>>
>>>>
>>>>  
>>>>  In terms of deleting records I use the sqlite3 utility to allow me to 
>>>> use SQL commands to interact with the database. Some people use GUI based 
>>>> sql editors to manipulate the database; I don't so I can't help you there. 
>>>> In terms of sqlite3 the following commands can be used to delete 
>>>> archive table records:
>>>>
>>>> $ sqlite3 /var/lib/weewx/weewx.sdb
>>>> sqlite> DELETE FROM archive WHERE dateTime=1741999500;
>>>> sqlite3> .q
>>>> $
>>>>
>>>> In this case we deleted the archive record timestamped 1741999500. 
>>>>
>>>> To delete a range of archive records by timestamp use something like:
>>>>
>>>> sqlite> DELETE FROM archive WHERE dateTime>=1741969500 AND 
>>>> dateTime<=1741999500;
>>>>
>>>> which would delete all archive records timestamped from 1741969500 
>>>> to 1741999500 inclusive.
>>>>
>>>> Note that you may need to install the sqlite3 utility on your system 
>>>> using something like (for debian based systems):
>>>>
>>>> $ sudo apt install sqlite3
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to a topic in the 
>>>> Google Groups "weewx-user" group.
>>>> To unsubscribe from this topic, visit 
>>>> https://groups.google.com/d/topic/weewx-user/1KX3-8lbU34/unsubscribe.
>>>> To unsubscribe from this group and all its topics, send an email to 
>>>> weewx-user+...@googlegroups.com.
>>>>
>>> To view this discussion visit 
>>>> https://groups.google.com/d/msgid/weewx-user/4506f570-fb2c-40f2-95ab-a98c02a59011n%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/weewx-user/4506f570-fb2c-40f2-95ab-a98c02a59011n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>> -- 
>> You received this message because you are subscribed to a topic in the 
>> Google Groups "weewx-user" group.
>> To unsubscribe from this topic, visit 
>> https://groups.google.com/d/topic/weewx-user/1KX3-8lbU34/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to 
>> weewx-user+...@googlegroups.com.
>>
> To view this discussion visit 
>> https://groups.google.com/d/msgid/weewx-user/6b7b89a9-5703-4ae9-8c01-bdaa310ff7can%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/weewx-user/6b7b89a9-5703-4ae9-8c01-bdaa310ff7can%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to weewx-user+unsubscr...@googlegroups.com.
To view this discussion visit 
https://groups.google.com/d/msgid/weewx-user/9a43dbf7-5658-49ef-819f-b00a1d5ba510n%40googlegroups.com.

Re: [weewx-user] Re: weeimport a smattering of missing records

Reply via email to