• 0 Posts
  • 89 Comments
Joined 1 year ago
Cake day: October 4th, 2023

  • If you’re talking about that VAE tiling feature or Tiled Diffusion or whatever it’s called, I think that it shows up in the text below the image in A1111, and I think that anything that shows up there is also stored in the generated image’s comment metadata.

    I don’t normally use Tiled Diffusion, if that’s what you’re referring to, but let me see if I can go generate something with it and check.

    checks

    Yeah, text: Tiled Diffusion: {"Method": "MultiDiffusion", "Tile tile width": 96, "Tile tile height": 96, "Tile Overlap": 48, "Tile batch size": 4}, gets added to the text below the image and to the image metadata.
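    If you want to check what actually landed in a given image, here is a minimal sketch that pulls the text chunks out of a PNG using only the standard library. It assumes the settings are stored in an uncompressed tEXt chunk (I believe A1111 uses the keyword "parameters", but treat that as an assumption and look at whatever keys come back). The function names are made up for illustration, and compressed or international text chunks (zTXt, iTXt) are not handled.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict[str, str]:
    """Return {keyword: text} for every uncompressed tEXt chunk in a PNG."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt data is "keyword", a NUL byte, then Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 8 + length + 4
    return found


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one PNG chunk; only used to fabricate a tiny demo file."""
    crc = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)
```

    For a real file, something like read_png_text_chunks(open("image.png", "rb").read()) should return whatever text keys the generator wrote.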

    That being said, I don’t know how far I’d trust the image metadata for reproducibility if this is a hard requirement you’re looking for. I have definitely seen various settings that mention that they induce non-deterministic behavior, and I’m not sure that all of those are encoded in the metadata. Also, while the version (and what looks like the git hash of the built version) is encoded, I’m sure that not everyone is using the same version, and I don’t know what compatibility is like across versions.

    EDIT: For example, see the “Optimizations” here:

    https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Optimizations

    You have a bunch of A1111 command-line optimization options that have descriptions like:

    --opt-sdp-attention May results in faster speeds than using xFormers on some systems but requires more VRAM. (non-deterministic)

    And those are not encoded in the image metadata, and that’ll make a given output non-reproducible.




  • tal@lemmy.today to Selfhosted@lemmy.world · Selfhosted chat service
    2 months ago

    I have already looked in XMPP, but it required SSL certs and I did not have the mood to configure them.

    There are definitely XMPP clients that do end-to-end encryption that do not rely on TLS for key exchange, though.

    https://en.wikipedia.org/wiki/Off_the_record_messaging

    Off-the-record Messaging (OTR) is a cryptographic protocol that provides encryption for instant messaging conversations. OTR uses a combination of AES symmetric-key algorithm with 128 bits key length, the Diffie–Hellman key exchange with 1536 bits group size, and the SHA-1 hash function. In addition to authentication and encryption, OTR provides forward secrecy and malleable encryption.

    The primary motivation behind the protocol was providing deniable authentication for the conversation participants while keeping conversations confidential, like a private conversation in real life, or off the record in journalism sourcing. This is in contrast with cryptography tools that produce output which can be later used as a verifiable record of the communication event and the identities of the participants. The initial introductory paper was named “Off-the-Record Communication, or, Why Not To Use PGP”.[1]

    I’ve used Pidgin with the libOTR plugin that implements that protocol.





  • tal@lemmy.today to Selfhosted@lemmy.world · Programmatic access to discord
    2 months ago

    I get that.

    Honestly, though, I’m still a little puzzled as to why people initially got into Discord; I never did.

    I can understand why people wanted to use some systems. Twitter does massive-scale real-time indexing. That was a huge feature, really changed what one could do on the platform.

    Reddit provided a good syntax (Markdown), had a low barrier to entry (no email verification at a time when that was common), and allowed third-party client access. It solved the spam problem that was killing Usenet and permitted more-reasonable moderation.

    There were a whole host of services that aimed to lower the complexity bar to get a web page and some content online associated with someone’s identity; it was clear that the technical knowledge required to get stuff up was a real limiting factor for many people.

    But I just didn’t really get where Discord provided much of a win over stuff like IRC. I mean, I guess maybe it bundled a couple services into one, which maybe lowered the bar to use a bit. IRC really seemed pretty fine to me. Reddit bundling image-hosting seems to have lowered the bar, been something that people wanted. Maybe Discord doing images and file-hosting made it more-accessible.

    I have no idea why a number of people who liked Cataclysm: Dark Days Ahead used Discord rather than Reddit; it seemed like a dramatically-worse system if one was aiming to create material for others to look back at and refer to.

    kagis

    https://old.reddit.com/r/RedditForGrownups/comments/t417q1/can_someone_please_explain_discord_to_me_like_im/

    It’s just modern day IRC with video.

    Ahaha, thanks. This is indeed an ELI60 response, although it doesn’t really explain how Discord suddenly got so popular. But if I couple this with /u/Healthy-Car-1860’s response, I’m kind of getting the picture.

    Got popular because it spread through the entire gamer/twitch community like wildfire due to actually being a more complete package and easier to use than anything prior. Online gamers have been struggling with voip software forever (Roger Wilco, Teamspeak, Ventrilo, Skype, and many others).

    Once it was rooted in the people who are on their computers app day every day it was bound to spread because the UX is incredibly easy compared to previous options for both chat and voip.

    Maybe that’s it. I never had a lot of interest in VoIP, especially group VoIP. When I was playing online games much, people used keyboards to communicate, not mics. There was definitely a period where people needed the ability to collaborate in games and games didn’t always provide that functionality. I remember people complaining about Teamspeak and Ventrilo. I briefly poked at Mumble – nice to have an open-source option – but I just had no reason to want to do VoIP with groups of people.

    But I suppose for a video game clan or something, that might be important functionality. And if it’s also a one-stop shop for some other things that you might want to do anyway, it maybe makes sense to just use that rather than multiple services.


  • If I need to do an emergency boot from a USB stick to repair something that can’t boot, which it sounds like is what you’re after, pretty much any Linux distro will do. I’d probably rather have a single, mainstream bootable OS than a handful.

    I’d use Debian, just because that’s what I use normally, so I’m most familiar with it. But it really doesn’t matter all that much.

    And honestly, while having an emergency bootable medium with a functioning system can simplify things, if you’re familiar with the boot process, you very rarely actually need emergency boot media on a Linux system. You have a pretty flexible bootloader in grub, and the Linux kernel can run and be usable enough to fix things on a pretty broken system, if you pass something like init=/bin/sh to the kernel, maybe busybox instead for a really broken system, and can remount root read-write (mount -o rw,remount /) and know how to force syncs (echo s > /proc/sysrq-trigger) and reboots (echo b > /proc/sysrq-trigger).

    I’ve killed ld.so and libc before and brought back systems without alternate boot media. The only time I think you’d likely really get into trouble truly requiring alternate boot media is (a) installing a new kernel that doesn’t work for some reason and removing all the old, working kernels before checking to see that your new one works, or (b) killing grub. Maybe if you hork up your partition table or root filesystem enough that grub can’t bring the kernel up, but in most of those cases, I’m not sure that you’re likely gonna be bringing things back up with rescue tools – you’re probably gonna need to reinstall your OS anyway.

    EDIT: Well, okay, if you wipe the partition table, I guess that you might be able to find the beginning of a filesystem partition based on magic strings or something and either manually reconstruct the partition table or at least extract a copy of the filesystem to somewhere else.
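    As a rough illustration of the magic-string idea, here is a sketch that scans a raw disk image for places where an ext2/3/4 filesystem might start. It relies on the ext superblock sitting 1024 bytes into the filesystem with the 16-bit magic 0xEF53 at offset 56 inside it. The function name and the 512-byte sector alignment are assumptions for the example. It only reports candidates: backup superblocks and chance byte matches will show up too, so each hit would need checking by hand.

```python
import struct

SECTOR = 512
SUPERBLOCK_OFFSET = 1024  # ext2/3/4 superblock begins 1024 bytes into the filesystem
MAGIC_OFFSET = 56         # offset of the s_magic field inside the superblock
EXT_MAGIC = 0xEF53        # stored little-endian on disk


def find_ext_candidates(image: bytes, sector: int = SECTOR) -> list[int]:
    """Return sector-aligned byte offsets where an ext filesystem may begin."""
    hits = []
    magic_distance = SUPERBLOCK_OFFSET + MAGIC_OFFSET
    # Stop early enough that the 2-byte magic read stays inside the image.
    for start in range(0, len(image) - magic_distance - 1, sector):
        (magic,) = struct.unpack_from("<H", image, start + magic_distance)
        if magic == EXT_MAGIC:
            hits.append(start)
    return hits
```

    On a real disk you would read the device in chunks rather than loading it all into memory, and then feed the candidate offsets to something that can validate the rest of the superblock.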


  • CIFS supports leases. That is, hosts will try to ask for exclusive access to a file, so that they can assume that it hasn’t changed.

    IIRC sshfs just doesn’t care much about cache coherency across hosts; it just kind of assumes that things haven’t changed underfoot and uses a timer to expire the cache.
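    To make the difference concrete, here is a toy timer-expired cache. It is not sshfs’s actual code, just a sketch of the general approach; the class name, the fetch callback, and the injectable clock are all made up for illustration. The point is that a change made on the other side stays invisible until the entry’s timer runs out, which is exactly the window that a lease-based protocol closes.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TimedCache:
    """Cache values from a slow backend and trust them until a timer expires."""

    def __init__(self, fetch: Callable[[str], Any], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            # Still inside the timer window: served from cache, possibly stale.
            return entry[1]
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value
```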

    considers

    Honestly, with inotify, it’d probably be possible to make a newer sshfs that does support leases.

    I suspect that the Unixy thing to do is to use NFSv4 which also does cache coherency correctly.

    It is easy to deploy sshfs, though, so I do appreciate why people use it; I do so myself.

    kagis to see if anyone has benchmarks

    https://blog.ja-ke.tech/2019/08/27/nas-performance-sshfs-nfs-smb.html

    Here are some 2019 benchmarks that show NFSv4 to generally be the most-performant.

    The really obnoxious thing about NFSv4, IMHO, is that ssh is pretty trivial to set up, and sshfs just requires a working ssh connection and sshfs software installed, whereas if you want secure NFSv4, you need to set up Kerberos. Setting up Kerberos is a pain. It’s great for large organizations, but for “I have three computers that I want to make talk together”, it’s just overkill.



  • I once worked on a product that you really did not want failing to come back up. I was on it several years after the original engineers had designed an early model. Said engineers had not tested what happened when the CMOS battery died and triggered a reset of BIOS settings, bringing it back to the hardware platform’s default state. When it did, the thing entered a non-bootable state. You could, with a serial port, access the BIOS and fiddle the settings back for one good boot…but the CMOS battery was non-removable, soldered to the motherboard. Our manufacturing process had not involved changing the default BIOS settings, just what was stored in CMOS. Oops.

    IIRC our customer care guys just sent out new models for free to affected customers – the original hardware model wasn’t sold in large volume, and the cost of the actual hardware components wasn’t especially large relative to the cost of the product.

    I had one sitting around on my desk, as it was sometimes handy to have a physically-accessible device to do work on. I rolled down to Radio Shack – yes, this was a few years back – got a removable CMOS battery case, stripped the non-removable battery out, soldered the battery case to the motherboard, and had the only instance of the device out there that could take a fresh CMOS battery.


  • tal@lemmy.today to Selfhosted@lemmy.world · Any good linux voice changer?
    3 months ago

    I haven’t used Piper, but I do want to let you know that it may be a lot easier than you think. I have used TortoiseTTS, and there, you can just feed in a handful (like, four or so) of short clips (maybe six seconds, arbitrary speech), and that’s adequate to let it do a reasonable facsimile of the voice in the recordings. Like, it doesn’t involve long recording sessions speaking back pre-recorded speech, and you can even combine samples from different people to “mix” their voices. I grabbed a few clean short recordings from video of someone speaking, and that was sufficient. TortoiseTTS doesn’t even retain the model; it rebuilds it from scratch from the samples you provided every time it renders voice (which is a testament to how little data it pulls in). It’s not on par with, say, the tremendous amount of work involved in creating a voice for Festival or similar. The “Option B” for Piper on the page I linked to has:

    I have built usable voice models with as few as 50 samples of the target voice.

    …which is more than the tiny handful that I was using on TortoiseTTS, but might open up a lot of options and provide control over what you’re hearing, especially if you have a voice that you really like.

    But, okay. Say you decide that you want to go the post-text-to-speech transform route. Do you have any idea how you want to process them? The most-obvious things I can think of are:

    • Pitch-shifting, like if you want the voice to sound more feminine or masculine.

    • Tempo-shifting, like if you want the voice to speak more-quickly or more-slowly, but without altering the pitch.

    Those are straightforward transforms that people do do on voice recordings; if you want a command-line tool that can do this in a pipeline, sox is a good choice that I’ve used in the past.
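    For what it’s worth, a minimal sketch of driving those two transforms from a script. It just builds a sox command line using sox’s pitch effect (shift in cents, so 100 is one semitone) and tempo effect (speed factor, pitch unchanged). The function names and the file names in the usage note are placeholders, and actually running it assumes sox is installed and on the PATH.

```python
import subprocess
from typing import List, Optional


def sox_command(src: str, dst: str,
                pitch_cents: Optional[int] = None,
                tempo: Optional[float] = None) -> List[str]:
    """Build a sox invocation applying an optional pitch and/or tempo change."""
    cmd = ["sox", src, dst]
    if pitch_cents is not None:
        # pitch: shift in cents without changing duration.
        cmd += ["pitch", str(pitch_cents)]
    if tempo is not None:
        # tempo: speed factor (>1 is faster) without changing pitch.
        cmd += ["tempo", str(tempo)]
    return cmd


def run_sox(cmd: List[str]) -> None:
    """Run the command, raising CalledProcessError if sox reports failure."""
    subprocess.run(cmd, check=True)
```

    So run_sox(sox_command("tts.wav", "out.wav", pitch_cents=-300, tempo=1.1)) would drop the voice three semitones and speed it up by ten percent.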

    I can imagine that maybe you just want to apply some kind of effect to it (sounding like a robot in an echoy cave? Someone talking over an old radio? Shifting perceptual 3d position in space of the audio source?). There’s a Linux – I’m assuming, given your preference for a CLI, and the community, that this is a Linux environment – audio plugin system called LADSPA and a successor system called LV2. Most Linux audio software, including sox, can run these on audio streams.

    You can maybe do automated removal of silent bits, if there are excessive pauses…sox has silence-removal functionality.

    But most other things that I can think of that one might want to do to a voice, more-sophisticated stuff, like making it sound happy, say, or giving it a different accent or something…I think that it’s going to be a lot harder to do that after the text-to-speech phase rather than before.


  • tal@lemmy.today to Selfhosted@lemmy.world · Any good linux voice changer?
    3 months ago

    Do you guys have any recommendation for a voice changer to process these audio files?

    I’m not totally sure what you’re going for.

    If you want to transform spoken audio to a different sort of voice, then that’s one problem.

    But this Piper thing appears to be a text-to-speech software package, and I’d think that it’d be easier and provide a more-capable system to just obtain a different voice and re-generate the audio from the text, rather than generating the audio and then transforming it, unless I’m not getting what you’re going for.

    Like, here’s a project – which I have not used – to generate Piper voices from audio samples of speech.



  • tal@lemmy.today to Selfhosted@lemmy.world · HDD or SSD for a home server?
    4 months ago

    For any computer today, server or no, I’d probably default to an SSD unless I expected to be making use of a large store of files that I expected to access in serial, like a large movie collection or maybe a backup server that can play well with rotational drives.

    The only thing there that looks like it could be doing that is the Samba server, depending upon what the remote clients are doing with it (could be a movie server).

    In general, if you can fit your stuff on an SSD today, I’d get an SSD.

    You can also add a rotational drive down the line if you run low on space and need inexpensive space for something that you’re going to access in serial, and use both; just move the bulk stuff to the rotational drive then.



  • Right now when updates get applied to the NAS, if it gets powered off during the update window that would be really bad and inconvenient require manual intervention.

    You sure? I mean, sure, it’s possible; there are devices out there that can’t deal with power loss during update. But others can: they’ll typically have space for two firmware versions, write out the new version into the inactive slot, and only when the new version is committed to persistent storage, atomically activate it.

    Last device I worked on functioned that way.

    you might lose data in flight if you’re not careful.

    That’s the responsibility of the application if it relies on the data being persistent at some point. Applications need to be written to deal with the fact that there may be in-flight data that doesn’t make it to the disk; if they intend to take other actions that depend on the data being on-drive, they’ll need to call fsync() or whatever their OS has first.

    Normally, there will always be a period where some data being written out is partial: the write() could complete after handing the data off to the OS’s buffer cache. The local drive could report completion because the data’s in its cache. The app could perform multiple write() calls, and the first could have completed without the second. With a NAS, the window might be a little bit longer than it otherwise would be, but something like a DBMS will do the fsync(); at any point, it’d be hypothetically possible for an OS crash or power loss or something to happen.
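    A minimal sketch of what that looks like from the application side, assuming a POSIX-ish system (the directory fsync in particular is a Linux/Unix habit). The function name is made up for illustration. The idea is that the function doesn’t return until the OS has been asked to push both the file contents and the directory entry out to stable storage, so anything done afterwards can reasonably depend on the data being there.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and ask the OS to flush it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()               # move Python's userspace buffer into the OS
        os.fsync(f.fileno())    # ask the OS to push its buffer cache to the drive
    # Also sync the containing directory so the new file's entry is persisted.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

    Whether the drive (or a NAS on the far end of a network mount) truly honors the flush is outside the application’s control, but this is the part the application is responsible for.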

    The real problem, that I need an nas for, is not the loss of some data, it’s when the storms hit and there’s flooding, the power can go up and down and cycle quite rapidly. And that’s really bad for sensitive hardware like hard disks. So I want the NAS to shut off when the power starts getting bad, and not turn on for a really long time but still turn on automatically when things stabilize

    Like I said in the above comment, you’ll get that even without a clean shutdown; you’ll actually get a bit more time if you don’t do a clean shutdown.

    Because this device runs a bunch of VMs and containers

    Ah, okay, it’s not just a file server? Fair enough – then that brings the case #2 back up again, which I didn’t expect to apply to the NAS itself.