• mojo_raisin@lemmy.world
    7 months ago

    Nothing sinister, we just don’t delete what we say we delete. Instead we keep it in your profile to feed the algorithms and set the “deleted” flag to make you think it’s gone.

    • Thann@lemmy.ml
      7 months ago

      They don’t care about your security or privacy, they care about being the exclusive vendor of your personal information.

    • Simon Müller@sopuli.xyz
      7 months ago

      I mean, to be completely fair, that’s how data storage works.

      We cannot really just make data disappear, so we let it get overwritten instead

      • mojo_raisin@lemmy.world
        7 months ago

        But clearly the data is not overwritten and this was intentional. How do I know? Because that would amount to a massive amount of data. If it were due to a bug in Apple software or the underlying filesystems, it would be detected in monitoring systems: “Hey, we’re using 10x the data we should be, maybe we should look into it”.

        The mistake was in the flag code that was supposed to fool us.

        • Simon Müller@sopuli.xyz
          7 months ago

          no when I say “overwritten” I mean that the area is set as deleted in the filesystem and the next time something writes to that area the data that was there before is disregarded.

          • barsoap@lemm.ee
            7 months ago

            and the next time something writes to that area the data that was there before is disregarded.

            A single overwrite might not be enough to defeat physical forensics because shadows of the old data persist in how the new data is stored. Also when it comes to SSDs you might be waiting a long time for the data to get overwritten as the drive will wear-level its erm sectors (what are those things called with SSDs?).

          • mojo_raisin@lemmy.world
            7 months ago

            So are you saying that they suffered from a filesystem bug that caused deletion failure? I’d imagine they use standard filesystems on their backend, I haven’t heard about any bugs like this.

            If you ask me, what’s more likely, that a company known for shitty behavior lies about deleting files so they can continue to use that information to profit, – OR – that they are experiencing a filesystem bug on their backend, I’ll choose the former.

            • brbposting@sh.itjust.works
              7 months ago

              Undeleting nudes

              That’s iPhone

              Seriously: I don’t think the cost benefit is there to intentionally make a maneuver like this. Any crap they pull needs to have a perfectly proper explanation, with our agreement to a specific term buried somewhere in their policies. Can only imagine how much money they blew throwing these billboards up all over the San Francisco Bay area. We have to buy Apple over Google for ostensible privacy gains, and Apple has to lock us in to their walled gardens to make up for their comparatively smaller ad/data business.

              This post assumes Apple is aethical (that’s like amoral but for ethics right?) but still a self-interested economic actor. They can’t let short-term greed get in the way of long-term greed!

              • mojo_raisin@lemmy.world
                7 months ago

                Seriously: I don’t think the cost benefit is there to intentionally make a maneuver like this.

                You might be right

                They can’t let short-term greed get in the way of long-term greed!

                lol

            • Simon Müller@sopuli.xyz
              7 months ago

              no I don’t believe a damn word of what Apple’s gonna say on this. I just wanted to get the message out there that file deletion generally works by allowing data to be overwritten. So if the images are local, this could very well be either that it’s showing data that hasn’t been overwritten yet, or that it accidentally brought things back out of “recently deleted”, depending on how long ago they were deleted.

      • lurch (he/him)@sh.itjust.works
        7 months ago

        the shred command in Linux tries to do this, but it may not work if the hardware moves rewritten data blocks around to mitigate wear.

        • tal@lemmy.today
          7 months ago

          shred doesn’t even necessarily work at the OS level. If you use something like ext3 and I assume ext4, normally when you overwrite data in a file, you’re not overwriting data even at the logical level in the block device. Journalling entails that you commit data to somewhere else on the disk, then update the metadata atomically to reference the new data.
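A minimal sketch of the overwrite-in-place idea behind shred, for illustration only. This is not how shred itself is implemented, and `overwrite_and_delete` is a hypothetical helper. As the comments above point out, neither the filesystem nor the drive guarantees that these writes land on the blocks that held the original data, so treat it as best effort.

```python
import os
import tempfile


def overwrite_and_delete(path: str, passes: int = 1) -> None:
    """Overwrite a file's bytes in place with random data, then unlink it.

    Hypothetical helper. Journaling or copy-on-write filesystems and SSD
    wear leveling may still leave the old data somewhere on the device.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            # Fine for small files; a real tool would write in chunks.
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the write to the device
    os.remove(path)


# Usage: create a throwaway file, then overwrite and remove it.
fd, path = tempfile.mkstemp()
os.write(fd, b"secret data")
os.close(fd)
overwrite_and_delete(path, passes=2)
file_gone = not os.path.exists(path)
```

The `fsync` call only asks the OS to flush its own buffers; where the bytes physically end up is still up to the filesystem and the drive firmware.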

      • solarvector@lemmy.zip
        7 months ago

        That’s skipping over the fact that recovering deleted data, even if it isn’t overwritten, is not an “oops”. It takes extra effort, and if that data isn’t being protected it would be overwritten incidentally as drives are used.

        There is a big difference in a database between “flagging” data and actually removing the association of the data to the database.
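To make the flag-versus-removal distinction concrete, here is a small sketch using Python's built-in sqlite3 module. The `comments` table and its `deleted` column are made up for illustration and are not any real service's schema.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, body TEXT, deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO comments (id, body) VALUES (?, ?)",
    [(1, "first comment"), (2, "second comment")],
)

# "Flagging": the row stays in the table and is only hidden from normal queries.
conn.execute("UPDATE comments SET deleted = 1 WHERE id = 1")

# Actual removal: the row is no longer part of the database.
conn.execute("DELETE FROM comments WHERE id = 2")

# What the user-facing query sees versus what is still stored.
visible = conn.execute("SELECT id FROM comments WHERE deleted = 0").fetchall()
stored = conn.execute("SELECT id, body, deleted FROM comments").fetchall()
```

Here `visible` is empty, so both comments look gone to the user, but `stored` still holds the full text of the flagged one and it could be restored by flipping the flag back.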

      • Forester@yiffit.net
        edit-2
        7 months ago

        Proper deletion should include writing all ones or all zeroes to the block but y’all be lazy as fuck.

        • AProfessional@lemmy.world
          edit-2
          7 months ago

          That just makes no sense to do, modern storage is write limited. As long as you used encryption the old bits mean nothing to anyone but you.

        • EvilBit@lemmy.world
          edit-2
          7 months ago

          I’m not an expert, but wouldn’t proper deletion be writing random ones and zeroes to the block? Multiple times?

        • cm0002@lemmy.world
          edit-2
          7 months ago

          Only necessary on the ol spinning rust, with SSDs not only is it completely unnecessary, but it also burns extra writes.

          Spinny’s store data magnetically on the platter as 1s and 0s; SSDs store data on the NAND as a held charge. In the simplest (single-level cell) case, a charged cell reads as a 0 and an uncharged, erased cell reads as a 1.

          With spinny’s, a file gets marked as “deleted” but the residual magnetic 1s and 0s will remain on the platter until eventually overwritten

          With SSDs a file gets marked “deleted” and within no more than a few minutes TRIM comes along and ensures the charge on the NAND is released for that data. There are no residuals to worry about like with spinny’s, and TRIM is in fact necessary to ensure decent lifespans.

          • Verat@sh.itjust.works
            7 months ago

            But wouldn’t TRIM be the deleting he is requesting? Removing the charges would be setting all the bits in that block to the same value.

          • brbposting@sh.itjust.works
            7 months ago

            Wow, the SSD can hold the charges perfectly while unplugged for ages? Amazing.

            In a post apocalyptic world where I am in charge of building a storage drive and I’m given all the instructions and fabs, the world is going without storage.

            • davidgro@lemmy.world
              7 months ago

              Wow, the SSD can hold the charges perfectly while unplugged for ages? Amazing.

              Yup. Before flash memory, devices like video game cartridges which had game saves actually needed a battery to power the memory holding the saves.

        • Simon Müller@sopuli.xyz
          7 months ago

          yeah cuz for normal, day-to-day use that’s exponentially slower the more you’re deleting

          You can do that when you wipe something.

    • sugar_in_your_tea@sh.itjust.works
      7 months ago

      That’s how a lot of people handle deleted data in a database: it’s literally just a flag. That’s why there’s a recommendation to edit Reddit posts before deleting them, to ensure they’re actually overwritten so they can’t just be restored.

      • fishpen0@lemmy.world
        7 months ago

        Every time someone says something like this I have to explain CDC and regular old backups. There’s no way in hell Reddit doesn’t keep cold and hot backups of their shit. And while Reddit is unlikely to be doing CDC for SOC 2 or other compliance reasons, it’s the easiest method to capture data for analytics purposes.

        CDC stands for change data capture. It’s generally done with databases by streaming the change log or ref log to a bucket or a service like Kafka, where you can fast forward and rewind the log queue to see the state of the DB at any point in time. Even if you edit your comments, the original is likely sitting in a Kafka topic or a Snowflake bucket outside of the DB or cache used for the presentation layer.

        Zero large scale websites operate with a truly single data store. There is always another layer that your user operations don’t impact
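A toy sketch of the rewind idea, assuming a made-up in-memory change log. The `Change` record and `replay` function are hypothetical and stand in for whatever a real CDC pipeline streams to Kafka or a bucket; no real Kafka or Snowflake API is used here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Change:
    """One entry in a hypothetical change log."""
    seq: int                    # position in the log
    op: str                     # "insert", "update" or "delete"
    comment_id: int
    body: Optional[str] = None  # new text for insert/update


def replay(log: List[Change], upto: int) -> Dict[int, str]:
    """Rebuild the comment table as it looked after log entry `upto`."""
    state: Dict[int, str] = {}
    for change in log:
        if change.seq > upto:
            break
        if change.op == "delete":
            state.pop(change.comment_id, None)
        else:
            state[change.comment_id] = change.body or ""
    return state


# A user posts, overwrites the comment with nonsense, then deletes it.
log = [
    Change(1, "insert", 42, "my original comment"),
    Change(2, "update", 42, "asdf nonsense"),
    Change(3, "delete", 42),
]

current = replay(log, upto=3)  # what the live site shows now
rewound = replay(log, upto=1)  # the log rewound to before the edit
```

`current` is empty, matching what the presentation layer shows, while `rewound` still contains the original text, because the edit and the delete were appended to the log rather than applied to its history.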

        • sugar_in_your_tea@sh.itjust.works
          7 months ago

          Yes, that’s certainly possible, but it’s also out of my control. I have basically three options:

          1. Delete account - we know this doesn’t delete comments
          2. Delete comment - “seems” to delete comments, but we’ve seen comments get restored - so probably using a “deleted” flag
          3. Edit comment with nonsense and then delete - should poison comment if they’re just using the deleted flag

          That’s it. There’s no guarantee it works, but it has a much higher chance of working than the other two.

          And there’s a good chance they delete old backups. Hosting every edit is expensive, so there’s a decent chance they clean up old data after some months.

          • fishpen0@lemmy.world
            7 months ago

            In 2019 the total size of the text stored by Reddit was only 50TB. A petabyte of data in cold storage is only 12k a year, so even if they grew 500x in size since 2019 (very unlikely) it’s a drop in their ARR. Given they sell the data for advertising and for AI, they are not deleting it. Reddit also self hosts a lot of their infra (they used to present their architecture at KubeCon), so the storage costs would be even lower.
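Working through that estimate with the figures quoted in the comment (50TB of text, a hypothetical 500x growth, 12k per petabyte per year), and assuming 1 PB = 1000 TB:

```python
# Figures as quoted in the comment above; none are independently verified.
text_2019_tb = 50           # total text stored in 2019, in TB
growth_factor = 500         # the deliberately pessimistic growth assumption
cost_per_pb_year = 12_000   # cold storage cost per PB per year, as quoted
tb_per_pb = 1000            # assumption: decimal petabytes

total_pb = text_2019_tb * growth_factor / tb_per_pb
annual_cost = total_pb * cost_per_pb_year

print(f"{total_pb:.0f} PB, about {annual_cost:,.0f} per year")
```

Under those assumptions that is 25 PB and roughly 300k a year for a single cold copy.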