• 0 Posts
  • 60 Comments
Joined 1 year ago
Cake day: June 22nd, 2023


  • What you describe is true for many file formats, but for most lossy compression systems the “standard” only strictly defines how to decode the data; any encoder that produces output that decodes correctly that way is compliant.

    The standard also defines a collection of “tools” that encoders can use, but how exactly to use, combine and tweak those tools is up to the encoder.

    Over time, new and better combinations of these tools are found for specific scenarios. That’s how different encoders for the same codec can produce very different output.

    As a simple example, almost all video codecs by default describe each frame relative to the previous one (i.e. they describe which parts moved and what new content appeared). There is of course also the option to send a completely new frame, which usually takes up more space. But when one scene cuts to another, sending a new frame can be much better. A “bad” encoder might not have scene-cut detection and still try to describe the difference to the previous frame, which can easily take up more space than just sending the entire new frame.
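    To make that concrete, here is a toy Python sketch of that encoder-side decision (pure illustration, not how any real codec represents data; the byte-level delta format is made up for the example):

        def encode_keyframe(frame: bytes) -> bytes:
            # A keyframe stores the whole frame from scratch.
            return frame

        def encode_delta(prev: bytes, frame: bytes) -> bytes:
            # Toy "delta": record (position, new value) for every byte that changed.
            changed = [(i, b) for i, (a, b) in enumerate(zip(prev, frame)) if a != b]
            return b"".join(i.to_bytes(4, "big") + bytes([b]) for i, b in changed)

        def encode_frame(prev: bytes, frame: bytes) -> tuple[str, bytes]:
            # A naive encoder just picks whichever representation is smaller;
            # a smarter one also runs explicit scene-cut detection here.
            delta, key = encode_delta(prev, frame), encode_keyframe(frame)
            return ("delta", delta) if len(delta) < len(key) else ("key", key)

        # A small change compresses well as a delta ...
        print(encode_frame(b"A" * 100, b"A" * 99 + b"B")[0])   # -> delta
        # ... but a scene cut (everything changes) is cheaper as a keyframe.
        print(encode_frame(b"A" * 100, b"Z" * 100)[0])         # -> key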





  • Note that just because everything is digital doesn’t mean something like that isn’t necessary: if you depend on your service provider to keep all of your records, you will be out of luck once they … stop liking you, go out of business, have a technical malfunction, decide they no longer want to keep any records older than X years, …

    So even in an all-digital world I’d still keep all the PDF artifacts in something like that.

    And I also second the suggestion of paperless-ngx (even though I haven’t been using it for very long yet, it’s working great so far).


  • Ask yourself what your “job” in the homelab should be: do you want to manage which apps are available, or do you want to be a DB admin? Because if you are sharing DB containers between multiple applications, you’ve basically signed up to closely checking the release notes of every release of every involved app for changes like this.

    Treating “immich+postgres+redis+…” as a single unit that you deploy and upgrade together makes everything simpler, at the (probably small) cost of requiring somewhat more resources. But even on a 4 GB RAM Raspberry Pi that’s unlikely to become the primary issue soon.


  • There are many different ways with different convenience/security tradeoffs. For example, for my homelab server I’ve set it up so that I have to enter the passphrase on every boot, which isn’t often. But I’ve also set it up to run an SSH server so I can enter it remotely.

    On my work laptop I simply have to enter it on each boot, but it mostly just goes into suspend.

    One could also keep the key on a USB stick (or better, use a YubiKey) and unplug it whenever that’s reasonable.


  • Just FYI: the often-cited NIST SP 800-88 standard (“Guidelines for Media Sanitization”) no longer recommends/requires more than a single pass of a fixed pattern to clear magnetic media. See https://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-88r1.pdf for the full text of Revision 1. In Appendix A it states:

    Overwrite media by using organizationally approved software and perform verification on the overwritten data. The Clear pattern should be at least a single write pass with a fixed data value, such as all zeros. Multiple write passes or more complex values may optionally be used.

    This is the standard that pretty much birthed the “multiple passes” idea, but modern HDD technology has made that essentially unnecessary (unless you are up against nation-state-sponsored attackers, in which case you should be physically destroying the media anyway, preferably using some high-heat method).
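    For illustration only, here is a minimal Python sketch of such a single-pass clear with verification (destructive, needs root; /dev/sdX is a placeholder, not a recommendation of any specific tooling):

        import os

        TARGET = "/dev/sdX"          # placeholder -- writing here destroys all data on that device
        CHUNK = 4 * 1024 * 1024      # work in 4 MiB chunks

        def clear_with_zeros(path: str) -> None:
            with open(path, "r+b") as dev:
                dev.seek(0, os.SEEK_END)
                size = dev.tell()
                dev.seek(0)
                # Single pass: overwrite everything with a fixed value (zeros).
                remaining = size
                while remaining:
                    n = min(CHUNK, remaining)
                    dev.write(b"\x00" * n)
                    remaining -= n
                dev.flush()
                os.fsync(dev.fileno())
            # Verification pass: read everything back and check it really is all zeros.
            with open(path, "rb") as dev:
                remaining = size
                while remaining:
                    n = min(CHUNK, remaining)
                    if dev.read(n) != b"\x00" * n:
                        raise RuntimeError("verification failed: non-zero data found")
                    remaining -= n

        if __name__ == "__main__":
            clear_with_zeros(TARGET)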



  • That saying also means something else (and imo something more important): RAID doesn’t protect against accidental or malicious deletion/modification. It only protects against data loss due to hardware faults.

    If you delete stuff or overwrite it then RAID will dutifully duplicate/mirror/parity-check that action, but doesn’t let you go back in time.

    That’s the same reason why just syncing the data automatically to another target isn’t a full backup either.
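    A toy Python sketch of that difference (the dicts are just stand-ins, nothing here is real RAID or backup tooling):

        from copy import deepcopy

        primary: dict[str, bytes] = {}
        mirror: dict[str, bytes] = {}       # stands in for a RAID mirror or an automatic sync target
        backups: list[dict[str, bytes]] = []

        def write(name: str, data: bytes) -> None:
            primary[name] = data
            mirror[name] = data             # the mirror dutifully replicates the change ...

        def delete(name: str) -> None:
            del primary[name]
            del mirror[name]                # ... and the deletion as well

        def take_backup() -> None:
            backups.append(deepcopy(primary))   # a real backup keeps a point-in-time copy

        write("report.pdf", b"important contents")
        take_backup()
        delete("report.pdf")                    # accidental (or malicious) deletion
        print(mirror)                           # {}  -> the mirror can't bring it back
        print(backups[-1])                      # {'report.pdf': b'important contents'} -> the backup can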





  • You don’t need a dedicated git server if you just want a simple place to store git repositories. Simply create a git repository (ideally a bare one) on your server and use ssh://yourserver/path/to/repo as the remote URL, and you can push and pull (a small sketch of this is below).

    If you want more than that (e.g. a nice web UI, user management, issue tracking, …), then Gitea is a common solution, but you can even run GitLab itself locally.
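    Here is that bare-bones workflow sketched as a Python script driving the git CLI (yourserver, /path/to/repo and the branch name main are placeholders, adjust to taste):

        import subprocess

        REMOTE = "ssh://yourserver/path/to/repo"   # placeholder host and path

        def run(*cmd: str) -> None:
            print("+", " ".join(cmd))
            subprocess.run(cmd, check=True)

        # One-time step, executed on the server over ssh: create a bare repository to push into.
        run("ssh", "yourserver", "git", "init", "--bare", "/path/to/repo")

        # Inside an existing local repository: point it at the server and push.
        run("git", "remote", "add", "origin", REMOTE)
        run("git", "push", "-u", "origin", "main")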


  • “Use vim in SSH” is not a great answer when someone asks for a convenient way to edit a single file, because it requires understanding multiple somewhat complex pieces of technology that OP might not be familiar with and that have a reasonably steep learning curve.

    But I’d still like to explain why it pops up so much. And the short version is very simple: versatility.

    Once you’ve learned how to SSH into your server you can do a lot more than just edit a file. You can download files with curl directly to your server, you can move files around, copy them, install new software, set up an entire new docker container, update the system, reboot the system, and much more.

    So while there are definitely easier-to-use solutions for the one singular task of editing a specific file on the server, the “learn to SSH and use a shell” approach opens up a lot more options in the future.

    So if in 5 weeks you need to reboot the machine but your web-based file-editing tool doesn’t support that, you’ll have to search for a new solution. But if you had learned how to use the shell, a simple “how do I reboot Linux from the shell” search will be all you need.

    Also: while many people like using vim, for a beginner in text-based remote management I’d recommend something simpler like nano.



  • ZFS combines the features of something like LVM (i.e. spanning multiple devices, caching, redundancy, …) with the functions of a traditional filesystem (think ext4 or similar).

    Due to that combination it can tightly integrate the two layers instead of treating the “block level” as an opaque layer. For example, each data block in ZFS is stored with a checksum, so data corruption can be detected. If a block is stored on multiple devices (due to a mirror setup or RAID-Z), the filesystem layer will read the other copies when it detects such corruption and re-write the “correct” version to repair the damage (sketched below).

    First off, most filesystems (unfortunately and almost surprisingly) don’t checksum their data like that: when the HDD returns rubbish they tend not to detect the corruption (unless the corruption is in their metadata, in which case they often fail badly via a crash).

    Second: if the duplication were handled by something like LVM, it couldn’t automatically repair errors in a mirror setup, because LVM would have no idea which of the copies is uncorrupted (if any).

    ZFS has many other useful (and some arcane) features, but that’s the most important one related to its block-layer “LVM replacement”.
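    To make the self-healing idea concrete, here is a toy Python sketch of the concept (not ZFS code; the two dicts stand in for two mirrored devices):

        import hashlib

        def checksum(data: bytes) -> bytes:
            return hashlib.sha256(data).digest()

        # Two toy "mirrored devices": block number -> (data, stored checksum).
        dev_a: dict[int, tuple[bytes, bytes]] = {}
        dev_b: dict[int, tuple[bytes, bytes]] = {}

        def write_block(n: int, data: bytes) -> None:
            # Like a mirror vdev: the same block and checksum go to both devices.
            dev_a[n] = dev_b[n] = (data, checksum(data))

        def read_block(n: int) -> bytes:
            # Try each copy; if one fails verification, return the good one and repair the bad one.
            for primary, other in ((dev_a, dev_b), (dev_b, dev_a)):
                data, stored = primary[n]
                if checksum(data) == stored:
                    if checksum(other[n][0]) != other[n][1]:
                        other[n] = (data, stored)   # "self-healing" the corrupted copy
                    return data
            raise IOError(f"block {n} is corrupt on all copies")

        # Demo: corrupt one copy and watch the read repair it.
        write_block(0, b"important data")
        dev_a[0] = (b"garbage!!", dev_a[0][1])      # simulate silent corruption on device A
        assert read_block(0) == b"important data"
        assert dev_a[0][0] == b"important data"     # device A was repaired from device B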



  • I’m torn a bit, because architecturally/conceptually the split that LVM does is the correct way: have a generic layer that can bundle multiple block devices to look like one and let any old filesystem work on top of that. It’s neat, it’s clean, it’s unix-y.

    But then I see what ZFS does (and btrfs, though I don’t use that personally) while “breaking” that neat separation, and it’s truly impressive. Sometimes tight integration between layers has serious advantages, and neat abstraction layers just don’t work quite as well.