Easy way to checksum/hash a file from REPL?
Moderators: adafruit_support_bill, adafruit

Please be positive and constructive with your questions and comments.

by kevinjwalters on Sat May 23, 2020 8:32 pm

Two or three times (i.e. probably once a year) I've noticed that after copying files to a board using Windows Explorer, the board reports an error on a very early line of code, possibly with this message:

Code:
Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
Traceback (most recent call last):
  File "code.py", line 1
SyntaxError: invalid syntax


That jumps out as strange because the first few lines are often just comments and imports and unlikely to be incorrect unless there's been a file editing blunder.

The last time it happened I knew the code worked because it was running on another identical board. I computed a SHA-256 from PowerShell, and the source file and the copy on the board matched. When I recopied the file it worked. I didn't think about it at the time, but I should have checked from the board's point of view too and not just from the host OS.

Is there an easy way to use CircuitPython from the REPL to compute a checksum or hash of a file on the CIRCUITPY drive?

kevinjwalters
 
Posts: 729
Joined: Sun Oct 01, 2017 3:15 pm

Re: Easy way to checksum/hash a file from REPL?

by tannewt on Mon May 25, 2020 4:46 pm

I don't know of a way to do a checksum or hash from CP.

This kind of SyntaxError can happen when the host OS hasn't actually finished writing the full file to the CIRCUITPY drive. Windows is particularly bad about this. You can prod it by making the drive "safe to remove" (ejecting it), which should hopefully cause a flush.

tannewt
 
Posts: 1781
Joined: Thu Oct 06, 2016 8:48 pm

Re: Easy way to checksum/hash a file from REPL?

by kevinjwalters on Mon Jun 01, 2020 11:39 am

I have just copied a file to three boards and only two picked it up. From the Windows host I have:


Code:
PS C:\> Get-FileHash E:\code.py -Algorithm SHA256

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          64AE73DFC4CCF42D86C94118A86E6932E197459F7DACE1D00EDAF970F1EE1492       E:\code.py

PS C:\> Get-FileHash F:\code.py -Algorithm SHA256

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          64AE73DFC4CCF42D86C94118A86E6932E197459F7DACE1D00EDAF970F1EE1492       F:\code.py

PS C:\> Get-FileHash G:\code.py -Algorithm SHA256

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          64AE73DFC4CCF42D86C94118A86E6932E197459F7DACE1D00EDAF970F1EE1492       G:\code.py


But I get a repeatable syntax error on one of the boards. I've just improvised the world's simplest checksum.

GOOD:

Code:
>>> with open("code.py", "rb") as file_bin:
...     filebindata = file_bin.read()
...
...
...
>>> total = 0
>>> for b in filebindata: total += b
...
>>> total
1891982
>>> len(filebindata)
24247


BAD (NB: total is different by two digits):

Code:
>>> with open("code.py", "rb") as file_bin:
...     filebindata = file_bin.read()
...
...
...
>>> total = 0
>>> for b in filebindata: total += b
...
>>> total
1893582
>>> len(filebindata)
24247
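
For anyone wanting to repeat this check, here is a minimal sketch of the byte-total approach above wrapped as a function that reads the file in small chunks, plus a bitwise CRC-32 as a stronger alternative. The names byte_sum and crc32 and the 512-byte chunk size are my own choices for illustration, not part of CircuitPython, and the sketch assumes a build with the built-in sum() and long-integer support (needed for the 32-bit CRC values).

```python
def byte_sum(path, chunk_size=512):
    """Return (total, length): the sum of all byte values and the byte count."""
    total = 0
    length = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += sum(chunk)
            length += len(chunk)
    return total, length


def crc32(path, chunk_size=512):
    """Return the CRC-32 of a file, computed bit by bit (no lookup table)."""
    crc = 0xFFFFFFFF
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for b in chunk:
                crc ^= b
                for _ in range(8):
                    # Reflected CRC-32 polynomial 0xEDB88320.
                    if crc & 1:
                        crc = (crc >> 1) ^ 0xEDB88320
                    else:
                        crc >>= 1
    return crc ^ 0xFFFFFFFF
```

A plain sum cannot detect bytes that have merely been reordered, which is why the CRC may be worth the extra time. On a host running CPython, crc32("code.py") should agree with zlib.crc32() over the same bytes, so the two sides can be compared directly.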


Something has mashed up at least one line; line 613 is where the syntax error is reported:

GOOD (this is what it looks like on the host):

Code:
>>> file_byline[613-1]
'                          "{:d} and RpsKeyDataAdvertisement {:d}".format(len(cipher_ads), len(key_ads)))\n'
>>> file_byline[614-1]
'            except KeyError:\n'
>>> file_byline[615-1]
'                pass\n'
>>> file_byline[616-1]
'            player_choices.append(opponent_choice)\n'



BAD:

Code:
>>> file_byline[613-1]
'                          "{:d} and RpsKeyDataAdvertisement {:d}".format(len(cipher_ads), len(key_     player_choices.append(opponent_choice)\n'
>>> file_byline[614-1]
'        pass\n'
>>> file_byline[615-1]
'       p wins ayer_choices.append(opponent_choice)\n'
>>> file_byline[616-1]
'\n'
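
The transcripts above don't show how file_byline was built. As a rough sketch of a similar check that avoids holding the whole file in memory, the hypothetical helper below (lines_around is an invented name, and it assumes the file object can be iterated line by line) returns just the lines near a given 1-based line number:

```python
def lines_around(path, line_no, context=3):
    """Return [(number, text), ...] for lines within `context` of line_no."""
    first = max(1, line_no - context)
    last = line_no + context
    selected = []
    with open(path, "r") as f:
        for number, text in enumerate(f, 1):
            if number > last:
                break  # past the window, no need to read further
            if number >= first:
                selected.append((number, text))
    return selected
```

Printing repr(text) for each returned pair on the board and on the host gives output comparable to the GOOD and BAD listings above.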


ADDED LATER: I pulled the file off the device by cutting and pasting the bytes printed on the serial console and re-creating the corrupted code.py on another machine. Here's the diff; the damaged area is a very small portion of the file.

Code:
$ diff -c actual-code.py corrupted-code.py
*** actual-code.py      2020-06-01 15:58:06.804888912 +0100
--- corrupted-code.py   2020-06-01 16:49:41.767492516 +0100
***************
*** 610,619 ****
                                round, round_msg1, round_msg2)
                  else:
                      print("Wrong number of RpsEncDataAdvertisement "
!                           "{:d} and RpsKeyDataAdvertisement {:d}".format(len(cipher_ads), len(key_ads)))
!             except KeyError:
!                 pass
!             player_choices.append(opponent_choice)

          ### Chalk up wins and losses
          for p_idx1, player in enumerate(players[1:], 1):
--- 610,618 ----
                                round, round_msg1, round_msg2)
                  else:
                      print("Wrong number of RpsEncDataAdvertisement "
!                           "{:d} and RpsKeyDataAdvertisement {:d}".format(len(cipher_ads), len(key_     player_choices.append(opponent_choice)
!         pass
!        p wins ayer_choices.append(opponent_choice)

          ### Chalk up wins and losses
          for p_idx1, player in enumerate(players[1:], 1):


It looks like there's one damaged chunk starting on a 128-byte boundary (the real granularity could be finer than that):

Code:
$ for skip in {182..184}
> do
>   echo ACTUAL ${skip}
>   dd if=actual-code.py bs=128 skip=${skip} count=1 2> /dev/null ; echo
>   echo CORRUPTED ${skip}
>   dd if=corrupted-code.py bs=128 skip=${skip} count=1 2> /dev/null ; echo
>   echo --------
> done
ACTUAL 182
 of RpsEncDataAdvertisement "
                          "{:d} and RpsKeyDataAdvertisement {:d}".format(len(cipher_ads), len(key_
CORRUPTED 182
 of RpsEncDataAdvertisement "
                          "{:d} and RpsKeyDataAdvertisement {:d}".format(len(cipher_ads), len(key_
--------
ACTUAL 183
ads)))
            except KeyError:
                pass
            player_choices.append(opponent_choice)

        ### Chalk u
CORRUPTED 183
     player_choices.append(opponent_choice)
        pass
       p wins ayer_choices.append(opponent_choice)

        ### Chalk u
--------
ACTUAL 184
p wins and losses
        for p_idx1, player in enumerate(players[1:], 1):
            (win, draw, void) = evaluateGame(my_choic
CORRUPTED 184
p wins and losses
        for p_idx1, player in enumerate(players[1:], 1):
            (win, draw, void) = evaluateGame(my_choic
--------
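
The same block-by-block comparison can be sketched in Python on the host, which makes it easy to scan the whole file rather than three hand-picked blocks. This is only an illustration: differing_blocks is an invented name, and the 128-byte default simply mirrors the bs=128 used with dd above.

```python
def differing_blocks(path_a, path_b, block_size=128):
    """Return the indexes of fixed-size blocks whose contents differ."""
    differing = []
    index = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(block_size)
            block_b = fb.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                differing.append(index)
            index += 1
    return differing
```

Calling differing_blocks("actual-code.py", "corrupted-code.py") returns block numbers that correspond to dd's skip value; repeating with a smaller block_size narrows down where the damage starts and ends.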

Re: Easy way to checksum/hash a file from REPL?

by kevinjwalters on Mon Jun 01, 2020 6:28 pm

Here's a comparison of the broken part of the file showing how the real file on the left maps to CircuitPython's view of code.py on CIRCUITPY:

[Image: actual-vs-corrupted-mapping-bs8-23408-23560.png. Comparison mapping 8-byte chunks with the same content between the actual and corrupted files.]
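
How the image was produced isn't described here, so the following is only a guess at one way to compute such a mapping. The hypothetical map_chunks function takes the two files' contents as bytes and, for each 8-byte chunk of the corrupted file inside a window, lists the offsets in the same window of the actual file where identical bytes occur. Searching at every byte offset, rather than only at chunk-aligned ones, is an assumption.

```python
def map_chunks(actual, corrupted, start, end, chunk_size=8):
    """Map each chunk of `corrupted` in [start, end) to matching offsets in `actual`."""
    mapping = []
    window = actual[start:end]
    for offset in range(start, end, chunk_size):
        chunk = corrupted[offset:offset + chunk_size]
        if not chunk:
            break  # ran past the end of the corrupted data
        sources = []
        pos = window.find(chunk)
        while pos != -1:
            sources.append(start + pos)  # convert back to a file offset
            pos = window.find(chunk, pos + 1)
        mapping.append((offset, sources))
    return mapping
```

A chunk whose only source is its own offset is undamaged, a chunk with a source at a different offset has been copied from elsewhere, and an empty list means the bytes appear nowhere in that part of the real file.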

Re: Easy way to checksum/hash a file from REPL?

by kevinjwalters on Mon Jun 01, 2020 6:46 pm

To complete the checks, I just tried ejecting the drive, then pressing the reset button, then power cycling with USB disconnected. I checked the file each time by totalling the bytes, and it remains unchanged in its corrupted state. chkdsk on the host reports that everything is clean.

Re: Easy way to checksum/hash a file from REPL?

by kevinjwalters on Thu Sep 03, 2020 11:15 am

The thread "File Save is Corrupting code.py" sounds like the same issue, but on a later version of Windows and far more frequent than mine.
