Feature - Rejected Better file corruption support

Irrat

A corrupt file sometimes contains long runs of null (0x00) bytes, caused by the writer being interrupted while the data is being written.
When half the settings have been written properly, the current check will not detect the corruption.

But it may be possible for settings to legitimately contain null bytes. I'm getting a headache thinking about this, so if someone else wants to take a shot at it, go ahead.

As a basic level of support, I was thinking: 10 null bytes in a row = corrupt file.
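For illustration, a minimal sketch of that null-run check. The class name, method names, and the threshold constant are all hypothetical and not part of KoLmafia; it only shows the "N null bytes in a row" idea and says nothing about whether a legitimate setting could trip it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class NullRunCheck {
  // Assumption: threshold taken from the "10 null bytes in a row" suggestion.
  static final int NULL_RUN_THRESHOLD = 10;

  /** Returns true if data contains at least `threshold` consecutive 0x00 bytes. */
  static boolean hasNullRun(byte[] data, int threshold) {
    int run = 0;
    for (byte b : data) {
      run = (b == 0) ? run + 1 : 0; // extend the run or reset it
      if (run >= threshold) {
        return true;
      }
    }
    return false;
  }

  /** Reads the whole file and applies the null-run heuristic. */
  static boolean looksCorrupt(Path prefsFile) throws IOException {
    return hasNullRun(Files.readAllBytes(prefsFile), NULL_RUN_THRESHOLD);
  }
}
```

A file with nine null bytes in a row would pass this check, so it is a heuristic rather than a guarantee.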

Also, possibly on top of the existing .bak, integrate the "parse session and restore preferences from it" from kolfix into mafia proper.
I think it'll only work if the preferences are logged, but better than nothing. It'd be nice to have this automatically handled.

Another thought is to check whether some preferences are missing. If a hardcoded preference is missing, the file is assumed to be corrupted. Maybe combine that with a null bytes check.
 
Well at least you have socialized the idea. I look forward to tests and code :-)

Perhaps because of a background writing commercial software where someone else decided the best use of my time, I would not do anything to detect and repair this. I would, however, put my effort into figuring out why it happens and fixing it.

Give me a repeatable test and I'll be glad to work on it.

Note that https://github.com/kolmafia/kolmafia/pull/1889 seemed to eliminate some (all?) of the reported corruption at the cost of a thread lock which we never were able to resolve. Hence closed, but not committed.

veracity wrote a script that looks at KoLmafia preferences that are not in defaults - i.e. some script created and maintained the preference. zarqon has a relay script that will identify KoLmafia properties that don't have a default or that are changed from a known default. Both of those are helpful for identifying values that might need some kind of manual intervention if a file is being "repaired".

KoLmafia has a lot of code to validate what it thinks the state of the world is compared to KoL's vision. So deleting a KoLmafia maintained preference will often reset it properly and if that code does not detect corruption then perhaps it should be extended to do so.

Since you seem to be one of the more frequent "reporters" of corruption, are there unusual conditions you operate under? Hardware? Operating system? Dial-up connection? Sharing KoLmafia files across machines or through the cloud? None of these might be relevant, but then mafia code is not necessarily state of the art when managing disk I/O with code that must work in various environments.
 
It's more that I'm sick of it being an issue looming over my head. I don't have 24/7 uptime, bots constantly writing to file. It's unlikely, but I don't like it.

I'm thinking that maybe the easiest solution will just be to add a property at the end of each file, like #DO NOT DELETE THIS LINE
Then if that line is missing, dump that file as corrupted and restore the backup (assuming the backup does have the line).

Edit: well, probably check that the file contains that line, not just that it ends with it. I don't trust the user.
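Roughly what I have in mind, as a sketch only. The class and method names are made up for illustration, and the marker text is just the placeholder from this post; none of this exists in KoLmafia.

```java
import java.util.List;

public class SentinelCheck {
  // Assumption: placeholder marker text from the post; the real wording is undecided.
  static final String SENTINEL = "#DO NOT DELETE THIS LINE";

  /** True if any line (ignoring surrounding whitespace) is the sentinel. */
  static boolean hasSentinel(List<String> lines) {
    for (String line : lines) {
      if (line.trim().equals(SENTINEL)) {
        return true;
      }
    }
    return false;
  }

  /** Picks which file to trust: "primary", "backup", or "none". */
  static String choose(List<String> primary, List<String> backup) {
    if (hasSentinel(primary)) {
      return "primary";
    }
    if (hasSentinel(backup)) {
      return "backup";
    }
    return "none";
  }
}
```

Checking "contains" rather than "ends with" means a user who appends lines by hand after the marker does not get their file thrown away.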

Atomic file moves were what I expected the solution to be, but then I saw some internet posts about how they can interfere with people's processes, and it reminded me of how unusual some people's mafia setups are. That may be why the current implementation was written the way it is. I was also a little concerned about demanding that data be written to disk immediately, given how rapidly preferences get switched.

Anyways, I'll probably make a "line at end of file" PR, with some sanity check to ensure that it wasn't intentional. Might also see about the kolfix integration. Not the entire thing, just the session restore.

Might be best as two PRs.
 
What is there about your use case that makes it an "issue looming over your head"?

You say "if they delete the line". Who is they?

I'm not sure how #DO NOT DELETE THIS LINE solves anything except when the corruption is the result of some kind of incomplete write. And it is not clear how deleting the line would harm me. Indeed, since I use text editors to deal with preferences files and parsers that don't recognize # as a comment I can see this breaking things that are currently fixed. This PR would take a lot more to sell me on it :-)

I don't understand the connection between "atomic file moves" and KoLmafia. Moving a file at the operating system level is something KoLmafia doesn't really support.

One way other applications deal with this is to never write to disk until the program exits. That works wonderfully as long as the program never exits except in a deliberate and controlled fashion and the OS is not caching file writes. IIRC saveSettingsOnSet might be controlling this.

A bit more insight into your specific environment/use case might suggest ways to solve the problem rather than detect and fix.

kolfix seems pretty niche to me but mafia changes to support it that have no other impact would not be an issue.

Not trying to be a jerk - asking questions is part of my problem solving toolkit but some people go on the defensive when I do. But I'm trying to solve a problem not create one.
 
What is there about your use case that makes it an "issue looming over your head"?
A bit more insight into your specific environment/use case might suggest ways to solve the problem rather than detect and fix.
I run multiple accounts via scripting without human interaction. They start up automatically, and some of the accounts run 24/7.

Sometimes I lose power or the system dies unexpectedly. In my case it's specifically unexpected shutdown I'm concerned about, where the data did not finish writing but enough data was written that the file still looks valid.


You say "if they delete the line". Who is they?
People who edit the file manually for various reasons.

I'm not sure how #DO NOT DELETE THIS LINE solves anything except when the corruption is the result of some kind of incomplete write. And it is not clear how deleting the line would harm me. Indeed, since I use text editors to deal with preferences files and parsers that don't recognize # as a comment I can see this breaking things that are currently fixed. This PR would take a lot more to sell me on it :-)
When I say corruption, I mean that the files are in an incomplete state: they can be parsed to some extent, but some of the data in them is lost. The most common cause that I am personally aware of is the process being killed, for example by a power failure or system crash, while the file is halfway through being written. Sometimes it leaves control codes in the file.

I don't understand the connection between "atomic file moves" and KoLmafia. Moving a file at the operating system level is something KoLmafia doesn't really support.

Java does support it, though not all OSes support it, so it would possibly need to fall back to a standard move?
Files.move(Paths.get(""), Paths.get(""), StandardCopyOption.ATOMIC_MOVE)
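To make that concrete, a rough sketch of "write to a temp file, then move it over the real one, falling back if atomic moves aren't supported". The class and method names are hypothetical and this is not how KoLmafia currently saves preferences. One caveat I'm aware of: the Java docs say that with ATOMIC_MOVE it is implementation-specific whether an existing target gets replaced or the call fails with an IOException.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicSave {
  /** Writes contents to a sibling temp file, then moves it over target. */
  static void save(Path target, String contents) throws IOException {
    Path dir = target.toAbsolutePath().getParent();
    // Temp file lives in the same directory so the move stays on one filesystem.
    Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
    try {
      Files.write(tmp, contents.getBytes(StandardCharsets.UTF_8));
      try {
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
      } catch (AtomicMoveNotSupportedException e) {
        // Fallback: ordinary (non-atomic) replace.
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
      }
    } finally {
      // No-op after a successful move; cleans up if something failed.
      Files.deleteIfExists(tmp);
    }
  }
}
```

Note this does not force the data to disk before the move, so it narrows the window for a half-written file rather than closing it completely.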

One way other applications deal with this is to never write to disk until the program exits. That works wonderfully as long as the program never exits except in a deliberate and controlled fashion and the OS is not caching file writes. IIRC saveSettingsOnSet might be controlling this.
Yes, the program writing to disk every time there is a preference change is the reason some people are more likely to see a mangled file (e.g., me), but that's a different topic.

kolfix seems pretty niche to me but mafia changes to support it that have no other impact would not be an issue.

Yes, it's only to recover in the case of settings being lost.


But ultimately, I'm going to close this, as I ended up just running my own verification before each kolmafia invocation. It's easy enough: if the settings file does not contain this line, or if there are null bytes, refuse to start up (and thus enter a data loss state) and send me a notification.
 
I think solving this outside Mafia is better because we've had a whole host of issues trying to fix this inside Mafia.
I think if anything, it'd be nice if mafia made a copy of the backup file if a possible settings corruption is detected. Doesn't do anything blocking, just puts it aside.

Just to avoid the state where you had no idea your settings were corrupted until you loaded it and it wiped your backup file.
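Something along these lines, as a non-blocking sketch. The names and the ".suspect-" suffix are invented for illustration, and how "corruption suspected" gets decided is left to whatever check is in use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class BackupPreserver {
  /**
   * If corruption is suspected, copies the existing backup to a timestamped
   * sibling so a later save cannot overwrite the last good copy.
   * Returns the path of the copy, or null if nothing was copied.
   */
  static Path preserveBackup(Path backup, boolean corruptionSuspected) throws IOException {
    if (!corruptionSuspected || !Files.exists(backup)) {
      return null;
    }
    // Hypothetical naming scheme: <backup name>.suspect-<epoch millis>
    Path aside =
        backup.resolveSibling(backup.getFileName() + ".suspect-" + System.currentTimeMillis());
    return Files.copy(backup, aside, StandardCopyOption.COPY_ATTRIBUTES);
  }
}
```

The original backup is left in place, so the existing backup / restore logic would behave exactly as before.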

But that's an issue that doesn't impact me personally.
 
Yes, that seems like a reasonable idea.

I remember the issue here -- loading corrupt settings and taking a new backup of the corrupt settings -- being the main issue sorting it in Mafia. Although the fact that that can happen in the first place would seem to imply that we try to read the settings before the backup / restore logic runs.
 
Thank you.

My suspicion is that the best solution at the moment is for you to have a process, external to mafia, that checks things before mafia starts and checks things when mafia ends. Failed checks trigger some kind of failure message and perhaps don't start mafia. The monitor might also make sure mafia is active so it never has to timein and might handle rollover explicitly.

Your atomic save example is irrelevant to mafia since mafia does not move files. An atomic write to disk implementation, where the file is either unchanged if an error occurs or is an exact copy of what was in memory if there are no errors, is indeed something mafia should be striving for. But getting there from here would probably involve a massive rewrite, which may not be something volunteers have the time or interest to do. My sense, from looking into this in the past, is that the problem is that we have preferences that change asynchronously, contained in a file that must be written in its entirety, and the data in memory must be "frozen" for the duration of the write. So it is quite possible that the issue is not due to disk I/O but to memory states that change during I/O.
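One way to picture "freezing" the data, purely as a sketch: copy the in-memory map under a lock, then serialize the copy outside the lock so later changes cannot alter what gets written. The class, its methods, and the key=value format are assumptions for illustration and do not reflect mafia's actual Preferences code or the locking attempted in the PR mentioned earlier.

```java
import java.util.Map;
import java.util.TreeMap;

public class PrefsSnapshot {
  private final Map<String, String> prefs = new TreeMap<>();

  /** Updates one preference; guarded by the same lock as snapshot(). */
  synchronized void set(String key, String value) {
    prefs.put(key, value);
  }

  /** Returns an independent copy; the lock is held only while copying. */
  synchronized Map<String, String> snapshot() {
    return new TreeMap<>(prefs);
  }

  /** Serializes a snapshot as key=value lines (hypothetical file format). */
  static String serialize(Map<String, String> snap) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : snap.entrySet()) {
      sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
    }
    return sb.toString();
  }
}
```

The slow part, writing the serialized text to disk, happens after the lock is released, so writers are blocked only for the duration of the copy.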

But I digress...
 