• 0 Posts
  • 17 Comments
Joined 1 year ago
Cake day: June 19th, 2023

  • If by document you mean “any kind of data structure”, then yes, those are documents

    Yep — that is what I mean by documents, and it’s what I meant all along. The beauty of documents is how simple and flexible they are. Here’s a URL (or path), and here’s the contents of that URL. Done.
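
    That model is small enough to sketch in a few lines. A minimal Swift example (the path is made up, and a real system would add error handling and atomic writes):

    import Foundation
    
    // A document store: here's a path, and here's the contents of that path. Done.
    let url = URL(fileURLWithPath: "/data/comments/12345.json")  // hypothetical path
    
    // Write a document…
    try Data(#"{"content": "Hello, world"}"#.utf8).write(to: url)
    
    // …and read it back.
    let document = try Data(contentsOf: url)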

    But then the term becomes meaningless, as literally anything is a document.

    No, because you can’t store “literally anything” in a Postgres database. You can only store data that matches the structure of the database. And the structure itself is limited: it has to be carefully designed or it will fall over (e.g. put an index on this column and inserts will be too slow; skip the index on that column and selects will be too slow; join these two tables and the server will run out of memory; store these columns redundantly to avoid a join and the server will run out of disk space…)

    Sure, but then finding that document takes 5 minutes

    Sure - you can absolutely screw up and design a system where you have to read millions of files to find the one you’re looking for. But documents are so flexible that it’s usually easy to design something efficient.

    I’m definitely not saying documents should be used for everything. But I am saying they should be used a lot more than they are now. It’s so easy to just write up the schema for a few tables and columns, hit migrate, and presto! You’ve got a data storage system that works well. Often it stops working well a year later when users have spent every day filling it with data.

    What I’m advocating is: stop and think. Should this be in a relational database, or would a document work better? A document is always more work in the short term - you need to carefully design every step of the process - but in the long term it’s often less work.

    Almost everything I work with these days is a hybrid - with a combination of relational and document storage. And often the data started in the relational database and had to be moved out because we couldn’t figure out how to make it performant with large data sets.

    Also, I’m leaning more and more into SQLite, with multiple relational databases instead of just a single one - often treating each database as a document. And I’m not alone: SQLite is very widely used, and so is document storage. They’re popular because they work, and if you’re never using them I’d suggest you’re probably compromising the quality of your software.
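
    A minimal sketch of that pattern (hypothetical path and schema, using SQLite’s C API via the SQLite3 module that ships on Apple platforms): each project gets its own self-contained database file, and that file can be copied, synced, or backed up like any other document.

    import Foundation
    import SQLite3
    
    // One self-contained database file per project - the file *is* the document.
    let path = "/data/projects/project-42.sqlite"  // hypothetical path
    var db: OpaquePointer?
    guard sqlite3_open(path, &db) == SQLITE_OK else { fatalError("can't open \(path)") }
    defer { sqlite3_close(db) }
    
    // A tiny schema holding just this one project's data.
    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)",
                 nil, nil, nil)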


  • I’m 99% certain this is wrong

    ? This is how Postgres stores data, as documents, on the local filesystem:
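
    (A sketch of a typical data directory - the OIDs and file names below are hypothetical, but the structure is standard: every table and index is stored as one or more plain files.)

    $PGDATA/
    ├── base/
    │   └── 16384/         # one subdirectory per database, named by OID
    │       ├── 16385      # a table's heap data
    │       ├── 16385.1    # …continued in 1GB segment files
    │       └── 16389      # an index - also just a file
    ├── pg_wal/            # write-ahead log
    └── postgresql.conf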

    There are hundreds, even thousands, of documents in a typical Postgres database. And it’s similar for other databases.

    But anyway, the other side of the issue is more problematic: converting relational data to, for example, an HTTP response.

    Persisting data as documents would be atrocious for performance.

    Yep… it’s pretty easy to write a query on a moderately large database that returns 1kb of data and takes five minutes to execute. You won’t have that issue if your 1kb is a simple file on disk. It’ll read in a millisecond.


  • The article you linked disagrees - they said it pretty well:

    Of course, some issues come from the fact that people are trying to use the Relational model where it doesn’t suit their use case. That’s why I prefer a document model instead of a tabular one as the default choice. Most of our applications are more suitable for it, as we’re still moving the regular physical world (so documents) into computers. (Read also more in General strategy for migrating relational data to document-based).

    I never joined the NoSQL hype-train so I can’t comment on that. However, I will point out that storing documents on a disk is a very well established and proven approach… and it’s even how relational databases work under the hood. They generally persist data on the filesystem as documents.

    Where I find relational data really falls over is at the conversion point between the relational and document representations. That typically happens multiple times in a single operation - for example when I hit the reply button on this comment (I assume, I haven’t read the source code) this is what will happen:

    1. my reply will be sent to the server as a document, in the body of an HTTP request
    2. beehaw’s server will convert that document into relational data (with a considerable performance penalty and a large surface area for bugs)
    3. PostgreSQL is going to convert that relational data back into a document format and write it to the filesystem (more performance issues, more opportunities for bugs)

    And every time the comment is loaded (or sent to other servers in the fediverse) that silly “document to relational to document” translation process is repeated over and over and over.

    I’d argue it’s better, and more efficient, to just store this comment as a document: it’s going to be needed in that format over and over, and you ultimately need to write it to disk as a document anyway.
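
    A minimal sketch of that approach in Swift (the IDs are made up, and a real ActivityPub implementation has many more fields): the comment is serialized once, as the same JSON document every consumer will eventually want, and written straight to disk.

    import Foundation
    
    // The comment, shaped like the document the fediverse will actually request.
    struct Note: Codable {
        var type = "Note"
        let id: String
        let inReplyTo: String
        let content: String
    }
    
    let note = Note(
        id: "https://beehaw.org/comment/12345",        // hypothetical IDs
        inReplyTo: "https://beehaw.org/comment/12344",
        content: "my reply"
    )
    
    let url = URL(fileURLWithPath: "/data/comments/12345.json")  // hypothetical path
    try JSONEncoder().encode(note).write(to: url)  // stored once, served as-is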

    Yes - you should also have a relational index containing the critical metadata from the document: the relationship linking that document to the comment I replied to, the number of upvotes it has received, etc etc… but that should be a secondary database, not the primary one. Things like an individual upvote should also be a document, stored as a file on disk (in the format specified by ActivityStreams 2.0).
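
    That secondary index can be tiny. A sketch with a hypothetical schema: relational where relations actually help, while the documents on disk stay the source of truth.

    import SQLite3
    
    var index: OpaquePointer?
    sqlite3_open("/data/comment-index.sqlite", &index)  // hypothetical path
    
    // Only the queryable metadata lives here; the comment text stays in the documents.
    sqlite3_exec(index, """
        CREATE TABLE IF NOT EXISTS comments (
            path    TEXT PRIMARY KEY,  -- where the document lives on disk
            parent  TEXT,              -- the comment it replies to
            upvotes INTEGER DEFAULT 0
        )
        """, nil, nil, nil)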



  • For me the article touches on the problem but doesn’t actually reveal it.

    What I see day in and day out is projects using a relational database to store data that is not suited to a relational database. And you can often get away with that fundamental mistake when you’re writing raw SQL queries… but as soon as an ORM is involved you’re in for a world of pain (or at least, problems with performance).



    WASM allows arbitrary code execution in an environment that doesn’t include the DOM… however it can communicate with the page, where the DOM is available, and it’s trivial to set up an abstraction layer that gives you the full suite of DOM manipulation tools in your WASM space. Libraries for WASM development generally provide that for you.

    For example here’s SwiftWASM:

    import JavaScriptKit  // SwiftWasm's bridge to the JavaScript world
    
    // Grab the page's global `document` object through the JS bridge.
    let document = JSObject.global.document
    
    // Create a <div>, set its text, and attach it to <body> - just like in JavaScript.
    var divElement = document.createElement("div")
    divElement.innerText = "Hello, world"
    _ = document.body.appendChild(divElement)  // result deliberately discarded
    

    It’s pretty much exactly the same as JavaScript, except you need to use JSObject to access the document class (Swift can do globals, but they are generally avoided), and Swift also presents a compiler warning if you execute a function (like appendChild) without doing anything with the result. Assigning it to the dummy “underscore” variable is how you tell the compiler you don’t want the output.



  • I don’t even know what Turbo 8 is

    Maybe you should find out?

    The idea behind Turbo is your server sends HTML/CSS to the client, and when the content needs to be updated… the server simply sends new HTML which Turbo will inject into the page. You can also annotate links so they fetch new content from the server instead of navigating to a new URL.

    Your server side code can be written in whatever language you prefer… Turbo being a 37signals project, I assume they’re using Ruby. It’d work fine with TypeScript too, if that’s your thing. Turbo just talks plain HTTP to the server (HTML over the wire) and doesn’t have a server side component.

    You can have client side code, but AFAIK there’s pretty minimal interaction with Turbo - you might, for example, add an event listener that processes the HTML and converts ISO date/times into localized strings with Date.toLocaleString().

    If you’re writing complex client side code then you shouldn’t be using Turbo at all.

    This change doesn’t affect, at all, the language used by users of Turbo. What’s changed is that the Turbo dev team themselves have chosen to write Turbo in vanilla JavaScript. And there are advantages to vanilla JS - it removes the compilation step from one language to another, for example.


    On some Unix systems (macOS for example) you can’t even do that with root.

    You’d need to reboot into the firmware, change some flags on the boot partition, and then reboot back into the regular operating system.

    To install a new version of the operating system on a Mac, the installer creates a new snapshot of your boot drive, updates the system there, then reboots, instructing the firmware to boot from the new snapshot. The firmware does a few checks of its own as well, and if the system fails to boot it will reboot from the old snapshot (which is only removed after successfully booting onto the new one). That’s not only a better/more reliable way to upgrade the operating system, it’s also the only way it can be done, because even the kernel doesn’t have write access to those files.

    The only drawback is you can’t use your computer while the firmware checks/boots the updated system. But Apple seems to be laying the foundations for a new process where your updated operating system will boot alongside the old version (with hypervisors) in the background, be fully tested/etc, and then switch over pretty much instantly. It would likely even replace the windows of running software with a screenshot, then instruct the software to save its state and relaunch to restore functionality to the screenshot windows (Macs already do this when the battery runs really low - closing everything cleanly before power cuts out, then restoring everything once you charge the battery).


  • When I last used a computer that had a single mode (about 20 years ago), I was in the habit of saving my work about every 15 seconds and manually backing up my documents (to an offline backup that wasn’t physically connected to the computer) multiple times per day.

    That’s how often the computer crashed. I never had a virus in those days; it was always innocent, unintentional software bugs that would force a reboot regularly and occasionally delete all of your files.

    Trust me, things are better now. I still save regularly and maintain backups, but I do it a lot less religiously than I used to, because I’ve lost my work just once in the last several years. It used to be far more often.






  • Create a folder, put markdown files in it, sync* and back up* the folder however you like, and edit the files with whatever you like*.

    Within my folder I have a daily journal - I start each day with a list of what I hope to achieve today and make notes throughout the day as I progress on those tasks. The next day, that journal becomes something I’ll refer back to in the morning to decide what to do next. Depending on the project, weekly or monthly might be more suitable than daily. Or maybe something else entirely.

    I also have folders and files for longer-term tasks.

    If you want to collaborate, make a second folder and choose a sync platform you can all agree on.

    (* I use GitHub for Sync, Backblaze B2 for backup, and Visual Studio Code for editing, with extensions for markdown and making GitHub a little easier… specifically GitDoc for auto-commit/push/pull and Markdown All in One for formatting/etc. Also Copilot is handy for some note taking tasks. The “foam” extension mentioned here looks like it might be great too)


  • I use various extensions for Visual Studio Code. They add a million features, but these are the ones I find most useful:

    I prefer to view the current status of my checkout in the sidebar of my code editor rather than on the command line.

    It’s easier to view a diff of a file and decide whether to stage or rollback changes in a GUI. With most GUIs you can even select individual lines of code and revert or stage them.

    I like how Commit and Push and Pull are a single “Commit & Sync” button in Visual Studio Code. Similarly there’s a simple “Sync” button in the status bar.

    Speaking of the status bar - it also has a counter for commits that need to be pushed or pulled. And it tells you what branch you’re currently on. And whether you have uncommitted changes. Handy.

    I find the GUI equivalent of git log --graph is significantly easier to understand when the graph is drawn with nice vector lines instead of ASCII art.

    Finally - I don’t just use raw git; I also use extensions for things like pull requests, and I create branches for issue numbers. I have an extension that shows pull requests in Visual Studio Code and also shows issues assigned to me, with a one-click “Start Working” button that creates a branch named after the issue and changes the issue status to In Progress. And when I’m finished working on it, there’s a button for that too.