when Konsole terminal exited while nwipe is sitting at the
drive selection screen.
To avoid 100% CPU usage, check for a runaway condition caused by the
function "keystroke = getch() that immediately returns an error
condition. We check for an error condition because getch() returns a
ERR value when the timeout value "timeout( 250 );" expires as well as
when a real error occurs. We can't differentiate from normal operation
and a failure of the getch function to block for the specified period
of timeout. So here we check the while loop hasn't exceeded the number
of expected iterations per second ie. a timeout(250) block value of
250ms means we should not see any more than (1000/250) = 4 iterations.
We double this to 8 to allow a little tolerance. Why is this necessary?
It's been found that in KDE konsole and other terminals based on the QT
terminal engine exiting the terminal without first existing nwipe
results in nwipe remaining running but detached from any interface
which causes getch to fail and its associated timeout. So the CPU or
CPU core rises to 100%. Here we detect that failure and exit nwipe
gracefully with the appropriate error. This does not affect use of
tmux for attaching or detaching from a running nwipe session when
sitting at the selection screen. All other terminals correctly
terminate nwipe when the terminal itself is exited.
This fixes a problem where the drive temperature is not updated
automatically in the drive selection window only. The temperature
is however updated every correctly every 60 seconds during a wipe in
the wipe status window.
This bug would probably never be noticed by most people as usually the
drive temperature changes slowly and only rises once a wipe has started.
The only time I imagine it would have been noticed would have been if
the drive temperature was already high and you were trying to reduce the
temperature by cooling before starting a wipe.
This has now been corrected so that the temperature in the drive
selection window is updated every 60 seconds.
This patch fixes a minor display issue that occurs when
a user aborts a wipe before a wipe has started. It only occurs
if the user had selected one or more drives for wipe and then
aborted before starting the wipe. The spurious message only
occurs in a virtual terminal, i.e. /dev/tty1, /dev/tty2, /dev/console
It does not occur in terminal applications such as konsole, xterm,
terminator etc.
The spurious message that appears in the main window, states that
"/dev/sdxyz 100% complete" along with garbage values in the statistics
window. The message appears for a fraction of a second before being
replaced with the textual log information that correctly states that
the user aborted and no wipe was started.
Basically the gui status information update function tries to update
the data when the wipe hasn't even started. The fix is to only update
the statistics information only if a wipe has started by checking the
'global_wipe_status' value which indicates whether any wipe started. '1'
indicates that a wipe has started, else '0' if no wipe has started.
If the drive becomes non responsive during the wipe, the MB/s will
slowly drop towards 0MB/s and will display a FAILURE -1 error. The logs
will display errors and nwipe's return status will be non zero, however
the summary table may display erased rather than FAILURE, this is because
the wipe thread exited prematurely without setting the pass error.
This fixes the error by checking the context's result status, i.e non zero
on failure and if pass equals zero it makes pass equal to one. This is
then picked up by the summary table log code which then marks the status
correctly as FAILURE in the summary table.
https://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html
"... random passes before and after the erase process, and by performing the deterministic passes in random order ..."
"The deterministic patterns between the random writes are permuted before the write is performed, ..."
Each ISAAC call generates a block of integers, but only the first integer was used before the PRNG was called again.
This resulted in most of the random numbers being wasted and more calls to the PRNG than was necessary.
Also fixed some segmentation faults in ISAAC-64 code.
This was caused by inconsistent labeling of the
serial number by smartctl. In a majority of cases
smartctl would output serial number data using
the label "Serial Number:" however on a SAS
drive we found that smartctl output the label
as "Serial number:" i.e. differs with a lower
case 'n'. Unfortunately we were doing a case
sensitive search for "Serial Number" so the result
being the serial number was not found.
This patch converts both strings in the search
to lower case before searching.
In addition a new field was added to the
anonymize list, "logical unit id:" so when the
-q option is used "serial number", "lu wwn device id:"
and "logical unit id:" are all now anonymized in the
smartctl debug data.
If pthread_join failed, nwipe would report an error
but would also report that the thread had been
cancelled. This has been corrected so that only
the error message is displayed in a fault condition.
Particularly relevant to Debian which when logged in as root
doesn't put /sbin in the $PATH environment setting.
This might have caused temperatures to not be available on Debian
systems, but not necessarily on distros based on Debian like
Ubuntu which would have worked ok.
The format specifier didn't match the size of the variable that holds
c2[i]->device_sector_size which is an int. This didn't appear to cause
a problem with reporting in a 64 bit build. It does cause a problem in
a 32 bit build displaying a very large & incorrect number for sector,
block and device sizes.
This doesn't cause any issues with the overall function, simply
incorrect sector, block and device sizes in 32 bit builds as displayed
in the nwipe log.
I also changed the signed long long for c2[i]->device _size to a
unsigned long long as there is little point in a negative device size.
Add /sys/class/hwmon/hwmonX/device/ to the list
of directories to search for a given device.
Some nvme devices/controllers put the device name
in /sys/class/hwmon/hwmonX/device/ rather than
/sys/class/hwmon/hwmonX/device/nvme/nvme0/
Update debug messaging.
For sdX devices we look in
/sys/class/hwmon/hwmonX/block for the device
name.
However, for nvme we access the device name
by looking in ..
/sys/class/hwmon/hwmonX/device/nvme/nvme0
nvme0n1
Multiple nvme drives/controllers will need
to be tested.
In the drive selection window when you
select a drive, the drive is identified
as selected for wiping with the [wipe]
label, however if you then select a
verify only method such as 'verify with
ones' or 'verify with zeros' it still
says [wipe] which is technically a
contradiction.
This patch changes the [wipe] to a
[vrfy] when a verify only method is
selected. If a method is selected
that writes data to the disc then the
label is displayed as [wipe].
If a faulty drive fails mid wipe and it's
throughput drops until eventually reaching
zero. The ETA calculation grows to an enormously
high value.
This patch prevents the ETA being calculated if
the throughput for a given drive drops below
100,000 bytes per second. In this way we can still
see that something is wrong because the ETA is much
higher than normal but prevents the sort of calculation
that looks like this ! 90867213:29:12 i.e ..
90,867,213 hours, 29 minutes and 12 seconds, or put
another way, 3,786,133 days or 10,372 years.
When many verification or pass errors are detected
the status line can wrap on a 80 column display.
This patch makes the error message more succinct
which will free up about 10 characters & prevents
the line wrapping.
If you used control-c during a wipe,
the summary table would report FAILED
rather than UABORTED, even though no
errors had occured and no errors were
reported in the error summary table.
The correct message on control-c abort
should be UABORTED if no errors on that
drive had so far been detected or FAILED
if errors had been detected.
nwipe reported correctly when letting
the wipe continue to it's natural completion.
This patch fixes that issue.
Anonymize the serial numbers in the gui, the
log and the summary table.
If a serial number was obtained from the device,
it is replaced with "XXXXXXXXXXXXXXX".
If the serial number could not be obtained from the
device, it's replaced with "???????????????".
This fixes column alignment issues in the gui
with nvme drives i.e. nvme0n1 etc. If the drive
name including path exceeds 8 characters the
/dev/ is removed and prefixed with spaces to
a total max length of 8 characters.
In both selection and wipe status windows I
moved the temperature readout to the right
of the disk size column.
Also in the wipe status window, by moving to
the right of the disk size column the
temperature is no longer on the same line as
the disk pass progress. This reduces the line
length as I found that in tests if you had
many verification errors the temperature
being on the end of the line would disappear
of the right of the screen.
Also removed 1. 2. etc from drive selection
to reduce the line length. Also removed the
space between > and [, ie "> [wipe]" becomes
">[wipe]" These changes remove 3 characters
and help to reduce the affect of the additional
temperature field [30C] which add 5 characters.
Therefore the line length overall, increased by
5-3 = 2 characters.
This helps to reduce line wrapping on 80
character terminals, when the drive model length
exceeds 24 characters.
1. Changed a few nwipe log messages to improve
readability.
2. Added code so that the temperature changes from
- white text = (Temperature within spec)
- red text = max continuous temp reached
- red flashing text = critical upper temperature reached
- black text = minimum continuous temperature reached
- black flashing text = critical lower temperature reached.
1. Changed gui format specifier to match signed
integer that we changed to in the previous commit.
2. Changed the format specifier in verbose nwipe logging of
the temperatures and moved the nwipe log message that prints
the hwmonX path to the log in order to remove
unnecessary printing to the log.
1. Changed u64 to int to support negative
temperature values, i.e. temp1_lcrit may
be as low as -40C
2. In gui surround temperature reading with
square brackets instead of round brackets to
be consistent with the rest of the drive status line
Stage 2 modifies the GUI to trigger a temperature
update every 60 seconds. Changes were made to the
drive progress line to include [ 30C ]. The drive
context structure had another variable added that
records the time of the temperature update.
Note. When the temperature data cannot be retrieved
from the hwmon (drivetemp) module the GUI displays
[ --C ]. USB devices, even those adapters that support
ATA pass through, don't seem to work with hwmon (drivetemp),
at least the adapters I have don't work with drivetemp
to monitor temperature.
Stage 1 adds the additional variables to the drive
context and creates the temperature initialisation
function, which associates a hwmonX directory with
a block device. Also wrote the context update function,
that reads hwmon for each drive context and writes the
temperatures back to the context.
Stage 2 commit to follow which will make changes within
the GUI to call the update function every 60 seconds
and display the temperature information.