Attached is a simple sqlite query for finding extra database rows.  In it, 
both databases are weewx databases, so if you're trying to find differences 
between weewx and some other database, you'll have to modify the query, but 
the idea remains the same.

Note that the order of the databases is important.  It reports rows that 
are IN db1 but NOT IN db2.  If you swapped the order of the databases, the 
counts would still be correct but it would print no rows because every 
record in the smaller database is in the larger database.

Walt
On Monday, March 17, 2025 at 5:07:39 PM UTC-5 gjr80 wrote:

> If you want to tell what data already exists you will need to get your SQL 
> hands dirty (you could use Cheetah produce a WeeWX report that lists all 
> the timestamps in some given period, but that's an awful lot of messing 
> around for little gain). WeeWX archive record timestamps are saved as an 
> integers, with the DBMS enforcing the unique constraint. So a source record 
> with one second difference to an existing record would be imported.
>
> The import process does report on what records were imported and what 
> records were not via the log; imported records will be logged as being 
> added and records that were rejected due to duplicate timestamps will be 
> reported with a unique key error. Arguably a little crude, but to do any 
> more detailed user friendly analysis and reporting would have to expensive 
> time wise. Of course the reporting is after the fact not before.
>
> Gary
>
> On Tuesday, 18 March 2025 at 02:14:24 UTC+10 [email protected] wrote:
>
>> Thanks gjr80. I guess the question is, how do I tell easily what data 
>> already exists (the time stamps will have to match, I assume). So 
>> "2025-02-28 08:15:00" existing (for example) I assume won't match against 
>> an import of "2025-02-28 08:16:00" (let alone if there's seconds other than 
>> 00). 
>>
>> Is there an "easy" way to tell? or am I stuck going into sqlite3 and 
>> doing some selects (not hard, just was kind of hoping the import process 
>> WOULD report on records already existing). =)
>>
>> On Sunday, March 16, 2025 at 5:59:03 PM UTC-7 gjr80 wrote:
>>
>>> The default weectl import action is to *not* import records where there 
>>> is already an archive record in the database with the same timestamp. 
>>> weectl 
>>> import will appear to import such records but if you look at the WeeWX 
>>> log you will see entries similar to:
>>>
>>> 2023-11-04 15:33:01 weectl-import[3795]: ERROR weewx.manager: Unable to 
>>> add record 2018-09-04 04:20:00 AEST (1535998800) to database 'weewx.sdb': 
>>> UNIQUE 
>>> constraint failed: archive.dateTime
>>>
>>> indicating a record with the same timestamp (in this case 2018-09-04 
>>> 04:20:00 AEST (1535998800)) already exists in the database and the imported 
>>> record was discarded. Long story, but it was too inefficient to check every 
>>> record before attempting to add it to the archive, hence the somewhat 
>>> cumbersome import and notification. Note that if you use the (presently) 
>>> undocumented --update command line option with weectl import 
>>> pre-existing records will be overwritten.
>>>
>>> Bottom line - just run weectl import as per the docs, and whatever you 
>>> do don't include --update on your weectl import command line. And of 
>>> course always make a backup of your database before importing.
>>>
>>> Gary
>>> On Monday, 17 March 2025 at 09:56:17 UTC+10 [email protected] wrote:
>>>
>>>> I lost a few days of data due to a network issue, but I have the data 
>>>> in another system. Am trying to import from csv, but I think my csv 
>>>> overlaps a bit with what's already in the weewx database. 
>>>>
>>>> Doing a dryrun, it says it will import every line in the csv... 
>>>>
>>>> Is there a way to get weectl import to NOT import duplicate data, or is 
>>>> there an easy way to see what data is in those dates so I can modify csv 
>>>> to 
>>>> exclude those entries?
>>>>
>>>> Thanks! 
>>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/weewx-user/a3b9c953-89ab-4e9b-8c42-45ce0846405en%40googlegroups.com.
This is a copy of the output from my test of the Extra Records query.
My test used identical weewx databases but one of them has all records for 
12/1/2024 missing.



PS C:\Users\bbb\documents\weather> ./sqlite3.exe
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .read 'Query Extra Records 1.txt'
row count in db1 is 221,788
row count in db2 is 221,692
2024-12-01 00:00:00
2024-12-01 00:15:00
2024-12-01 00:30:00
2024-12-01 00:45:00
2024-12-01 01:00:00
2024-12-01 01:15:00
2024-12-01 01:30:00
2024-12-01 01:45:00
2024-12-01 02:00:00
2024-12-01 02:15:00
2024-12-01 02:30:00
2024-12-01 02:45:00
2024-12-01 03:00:00
2024-12-01 03:15:00
2024-12-01 03:30:00
2024-12-01 03:45:00
2024-12-01 04:00:00
2024-12-01 04:15:00
2024-12-01 04:30:00
2024-12-01 04:45:00
2024-12-01 05:00:00
2024-12-01 05:15:00
....(output truncated:  remaining 12/1/2024 timestamps)...
-- Find rows in db1 that are not in db2

attach 'weewx.sdb' as db1
;
attach 'weewx2.sdb' as db2
;

select 'row count in db1 is ' || printf('%,i', count(*)) 
  from db1.archive;

select 'row count in db2 is ' || printf('%,i', count(*)) 
  from db2.archive;


with t1 as (
    select datetime from db1.archive
    except
    select datetime from db2.archive
)
select datetime(datetime, 'unixepoch', 'localtime') as dt1
  from t1
;

Reply via email to