It’s excruciatingly obnoxious to have to rely on third party sources for what should be a first-party feature.

Like, I select all and then search a query. “Oh no, nobody on your server used a third party service to find it, so you won’t see it here.”

Like, how short-sighted is that, really? If I search for a string in the ‘all’ servers, I should have a list of ‘all’ the servers containing that string.

It’s a really simple concept. Not sure why this post even has to be made, but I’m wondering if there’s something I can do to make these ‘features’ more intuitive.

  • Blaze (he/him)@sopuli.xyz
    1 year ago

    Just bringing this to everyone’s awareness, the issue is already tracked here: https://github.com/LemmyNet/lemmy/issues/2951

    From the Lemmy devs

    I think the lemmy-ui’s could very much benefit from a “global community discovery service” like https://browse.feddit.de , but integrated into the front ends. I’d of course prefer that each lemmy back-end do their own crawling of communities and instances, to make it as distributed as possible.

  • PupBiru@kbin.social
    1 year ago

    totally understand the frustration, and i’m not going to try and invalidate it!

    … however, it’s definitely not a problem with a simple solution

    since anyone can start an instance, when you search “all”, where should it search? i don’t mean generally like “all the instances”, i mean where specifically? things like lemmy.world, lemmy.ml, kbin.social, etc are obvious… but what about lemmy.mydomainforfriends.social (not real but let’s pretend someone created their own little instance for friends there!)?

    let’s say you say yes that should be searched, okay… how does your instance know it’s there? does it tell all other instances that it exists at some point? where does IT get that list from? (the current solution to this is that your instance starts to “know about” an instance after someone interacts with it, but this has the problem you’ve described)

    let’s say that instance shouldn’t be searched… now, what are the rules (automatic, i’d assume; not with human intervention) that would allow an instance to be added to some big list somewhere? also where is that list? now we’re back at problem 1: how do you store a federated list of servers?

    the problem gets even harder when you consider mastodon, pixelfed, peertube, etc… all these services interact: should “all” include them? only certain things in them?

    • bobman@unilem.orgOP
      1 year ago

      since anyone can start an instance, when you search “all”, where should it search?

      Easy! It should search all the servers your server is federated with! Servers should contain a list of their community names that can be easily and quickly queried by other servers.

      • Zalack@startrek.website
        1 year ago

        Federation isn’t opt-in though. It would be VERY easy to spin up a bunch of instances with millions or billions of fake communities and use them to DDOS a server’s search function.

        Searching current active subscriptions helps mitigate that vector a little.

        • Benj1B@sh.itjust.works
          1 year ago

          I would suggest that instances should have settings that allow them to decide whether to “advertise” a community list, with configurable options like “all”, “most active”, “top X”, or even a manually maintained list, depending on the admins’ and the instance’s preferences.

          Then your home instance, when searching, should have its own settings to decide what results it’s going to ping other servers for. Big/popular/high confidence instances can have an open all/all relationship, while you might query only the top 10 communities from unknown or new instances to handle the scenario you describe.

          Federation can be binary yes/no but there should be room to add more logic around enabling search on communities from your instance and controlling the search results from other instances. I don’t think the two are mutually exclusive, unless I fundamentally misunderstand how federation works!
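
          A rough sketch of the two-sided idea above, with heavy assumptions: the trust tiers, the `LIMITS` values, and the `advertised`/`accepted` helpers are all made up for illustration and are not Lemmy settings or API calls. The remote side decides what it advertises, and the home side caps how much it takes from each instance.

```python
from dataclasses import dataclass

# Hypothetical trust tiers and caps; not part of Lemmy's actual configuration.
LIMITS = {"trusted": None, "known": 100, "unknown": 10}


@dataclass
class Community:
    name: str
    subscribers: int


def advertised(communities, mode="all", top_n=10):
    """Remote side: decide which communities an instance advertises."""
    if mode == "none":
        return []
    ranked = sorted(communities, key=lambda c: c.subscribers, reverse=True)
    return ranked if mode == "all" else ranked[:top_n]


def accepted(remote_list, trust):
    """Home side: cap how many advertised communities are taken per instance."""
    limit = LIMITS[trust]
    return remote_list if limit is None else remote_list[:limit]


remote = [Community(f"c{i}", subscribers=i) for i in range(50)]
from_unknown = accepted(advertised(remote, mode="all"), trust="unknown")
from_trusted = accepted(advertised(remote, mode="top", top_n=20), trust="trusted")
print(len(from_unknown), len(from_trusted))
```

          Even when an unknown instance advertises everything, the home instance only keeps its top 10 here, which is the kind of limit that would blunt the fake-community scenario.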

        • bobman@unilem.orgOP
          1 year ago

          I… don’t think you know what ddossing means but okay.

          Would it really be very easy? Especially considering that once instances find you’re doing that, they just block you? Would it be worth people’s time?

          Is there any way around this, perhaps querying a global repository of federated instances and sorting them by popularity?

          In all honesty, you don’t have a point. If you did, third-party services already wouldn’t offer this. Seeing as they can, it’s clearly possible.

          • Zalack@startrek.website
            1 year ago

            Sorry, you’re right that I wasn’t being precise with my terminology. It’s not a DDOS, but it could be used to slow down targeted features, take up some HTTP connections, inflate the target’s DB, and waste CPU cycles, so it shares some characteristics of one.

            In general, you want to be very very careful of implementing features that allow untrusted parties to supply potentially unbounded resources to your server.

            And yeah, it would be trivial to write a set of scripts that pretend to be a lemmy instance and supply an endless number of fake communities to the target server. The nice thing about this attack vector is that it’s also not bound by the normal rate limiting since it’s the target server making the requests. There are definitely a bunch of ways lemmy could mitigate such an attack, but the current approach of “list communities current users are subscribed to” seems like a decent first approach.
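
            To make the “unbounded resources” point concrete, here is a minimal sketch, assuming a made-up cap (`MAX_COMMUNITIES_PER_INSTANCE`) and a simulated hostile peer. Nothing here is Lemmy code; it only shows that the reader has to be the one who stops pulling, because the remote side never will.

```python
from itertools import count, islice

MAX_COMMUNITIES_PER_INSTANCE = 500  # assumed cap, chosen only for illustration


def hostile_instance():
    """Simulates a fake instance that never stops returning communities."""
    for i in count():
        yield f"fake_community_{i}"


def ingest(remote_listing, cap=MAX_COMMUNITIES_PER_INSTANCE):
    """Read at most `cap` entries, then stop pulling from the remote."""
    return list(islice(remote_listing, cap))


stored = ingest(hostile_instance())
print(len(stored))
```

            Without the `islice` cap, `list(hostile_instance())` would never return, which is the failure mode described above in miniature.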

    • Waraugh@lemmy.dbzer0.com
      1 year ago

      So many options, doing none seems lazy. I can source all kinds of lists for my pihole to block traffic. I can put a lot of repos in my yum.conf. It’s not like this should be reliant on any one single source of truth. There could certainly be an open source list maintained. I’m surprised this is considered such a difficult problem with so many smart folks involved; I’m obviously really ignorant of how this stuff works. I just don’t get how a problem that seems to have been solved across a litany of technical products using shared sources in decentralized environments is such an exotic hurdle here.

      • PupBiru@kbin.social
        1 year ago

        okay so now you have a decentralised list with 1000 servers on it. does your instance… make 1000 requests when you search?

        • Waraugh@lemmy.dbzer0.com
          1 year ago

          Lists can be cached and updated. Even if posts from All don’t include all active content, it would be very manageable to have queries include communities across instances based on names and other fields. All this shit is a solved problem already.
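
          A minimal sketch of the caching idea, assuming each instance could hand back a list of community names through some `fetch` callable (hypothetical, no such Lemmy endpoint is implied). Searches read the local cache, and a remote request only happens when an entry is missing or older than the TTL.

```python
import time


class CommunityIndexCache:
    """Local cache of remote community names, refreshed at most once per TTL."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch          # hypothetical callable: instance -> list of names
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # instance -> (fetched_at, names)

    def names_for(self, instance):
        cached = self._entries.get(instance)
        if cached is None or self._clock() - cached[0] > self._ttl:
            cached = (self._clock(), self._fetch(instance))
            self._entries[instance] = cached
        return cached[1]

    def search(self, instances, query):
        """Search cached names; remote fetches happen only on cache misses."""
        q = query.lower()
        return [
            f"{name}@{instance}"
            for instance in instances
            for name in self.names_for(instance)
            if q in name.lower()
        ]


calls = []


def fake_fetch(instance):
    # Stand-in for a network request; records how often it is called.
    calls.append(instance)
    return ["gardening", "linux"]


cache = CommunityIndexCache(fake_fetch)
first = cache.search(["a.example", "b.example"], "garden")
second = cache.search(["a.example", "b.example"], "garden")
print(first, len(calls))
```

          Two searches across two instances cost two fetches in total, not four, so a list of 1000 servers would not mean 1000 requests per search. It does not address how the list of instances is obtained in the first place.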

  • Destragras@kbin.social
    1 year ago

    Pleroma calls their equivalent of “All” the “Known Network” instead, which does a better job explaining what will show up there in my opinion.

  • moobythegoldensock@geddit.social
    1 year ago

    The simple explanation is your instance doesn’t “know” what’s out there. lemmy.world doesn’t know when lemmy.ml adds a community, and it doesn’t know when hypothetical.server pops up as a new instance. There’s not really a good way of knowing that without having a central repository, which defeats the purpose of a decentralized platform.

    One thing you can do is use Lemmy Explorer to search for communities on other instances and subscribe to them. This will fill up All for everyone on your instance.

    • Waraugh@lemmy.dbzer0.com
      1 year ago

      Looks to me like Lemmy Explorer could just be sourced for results fairly easily, even if it was just added as an additional source to the default listings, similar to setting up yum repos etcetera. Is there a good reason this isn’t a thing? I know my use and exposure to communities is severely limited by the current cluster fuck of finding communities. I just don’t care enough to go further than searching in the app and closing out if nothing shows up. I realize my laziness contributes to my user experience, but saying an instance doesn’t know what’s out there and then providing a site that will let me search for what’s out there doesn’t seem logical.

    • bobman@unilem.orgOP
      1 year ago

      It should be able to search a list of communities available on other servers it’s federated with.

      This would be a very simple feature to implement and should not cause significant overhead.

  • InquisitiveFactotum@midwest.social
    1 year ago

    Forgive what is probably a silly naive question…

    Can someone point me to an explanation of the federated architecture of lemmy? I haven’t found one yet that has helped me build a good mental model. I either get a step-by-step startup guide, or discussions on the merits/demerits of a distributed system.

    I think I’ve pieced together that it’s basically independent “instances” of the machine, each with their own communities within. Sort of like if there were multiple instances of reddit, each with its own r/aww or whatever. I don’t yet understand, however, how these interact/relate/overlap/collaborate… which I think is the basis for this thread.

    • larvyde@sh.itjust.works
      1 year ago

      when a user (let’s call them Kim) on one instance (let’s call it “Works”), subscribes to a community on another (let’s call that one “World”), Works creates a copy of the community on its own database. It also asks World to notify it when there is an update to the community – when there is a new post, new comment, up/downvote, something gets deleted, etc. Kim can now browse and interact with the community on Works. Works will also notify World when Kim does something in the community so everything syncs and everyone sees the same thing.

      So really, the problem OP is describing is simply a natural consequence of communities not existing on Works until someone subscribes to them.
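
      The flow above can be sketched as a toy model. This is a simplification under stated assumptions: the `Instance` class, its method names, and the `community@instance` key are illustrative only, updates flow in one direction (World to Works), and nothing here reflects Lemmy’s real data model or the ActivityPub messages involved.

```python
class Instance:
    def __init__(self, name):
        self.name = name
        self.local = {}        # community name -> list of posts hosted here
        self.copies = {}       # "community@instance" -> mirrored posts
        self.followers = {}    # community name -> instances to notify

    def create_community(self, community):
        self.local[community] = []
        self.followers[community] = []

    def subscribe(self, remote, community):
        """First local subscriber: copy the community and ask for updates."""
        key = f"{community}@{remote.name}"
        if key not in self.copies:
            self.copies[key] = list(remote.local[community])
            remote.followers[community].append(self)

    def post(self, community, text):
        """A post on the home instance is pushed to every following instance."""
        self.local[community].append(text)
        for follower in self.followers[community]:
            follower.copies[f"{community}@{self.name}"].append(text)

    def search(self, query):
        """Search only sees what this instance already stores."""
        return [k for k in list(self.local) + list(self.copies) if query in k]


world, works = Instance("World"), Instance("Works")
world.create_community("gardening")
before = works.search("gardening")      # nobody on Works has subscribed yet
works.subscribe(world, "gardening")     # Kim subscribes
world.post("gardening", "first tomato")
after = works.search("gardening")
print(before, after, works.copies["gardening@World"])
```

      The search on Works comes back empty until the first subscription, which is the behaviour OP ran into.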

      • InquisitiveFactotum@midwest.social
        1 year ago

        Thanks, this makes sense. So, the last thing I’m wondering about is the redundancy/exclusivity of communities. For example, could there be a community called ‘gardening’ on the “Works” instance and also an independent community by the same name on “World” (before anyone is mutually subscribed)? Seems like it could… And if so, what happens when someone cross-subscribes to ‘gardening’?

        Specifically, (from a user experience standpoint) do these redundant communities coalesce into one? Because some of the benefit of these communities (particularly the more niche) is pulling together the experts into one community.

        • larvyde@sh.itjust.works
          1 year ago

          The gardening community on World will be called gardening@World on Works. They will continue to be distinct communities, and you can subscribe to either or both independently.
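
          One way to picture why the two never merge, as a small sketch: if a community is identified by the pair of name and home instance rather than by name alone, two “gardening” communities are simply different keys. The `CommunityId` type is a hypothetical illustration, not how Lemmy stores communities.

```python
from typing import NamedTuple


class CommunityId(NamedTuple):
    name: str
    instance: str

    def __str__(self):
        return f"{self.name}@{self.instance}"


# Same name, different home instance: two separate subscriptions.
subscriptions = {
    CommunityId("gardening", "Works"),
    CommunityId("gardening", "World"),
}
labels = sorted(str(c) for c in subscriptions)
print(labels)
```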

  • Jeena@jemmy.jeena.net
    1 year ago

    What does ‘All’ mean to you?

    In this context it means all posts which are stored on the server you are on. And only things that people have subscribed to are stored. It does not mean “‘all’ servers”.

    There are good reasons why the protocol has been designed like that, if you’re interested then you can find out about it. If not, reddit still exists for people who like it more.

    • Spuddlesv2@lemmy.ca
      1 year ago

      All means all. If it isn’t actually All (it isn’t) then it should be called something else.

      • Jeena@jemmy.jeena.net
        1 year ago

        But it is all, just not the all you think, it’s all things the server is aware of, not all things in the universe.

        • Zetaphor@zemmy.cc
          1 year ago

          This is not obvious to anyone who doesn’t have some understanding of how networking and federation work, which is most people. Especially if we’re talking about users who have only ever experienced centralized platforms.

          It should be called “Known Network” or something more transparent that doesn’t require an explanation of indexing.

            • Zetaphor@zemmy.cc
              1 year ago

              It’s an understandable response. They were previously in a position where this was such an obvious concept that it didn’t merit any thought, and now they are required to have an understanding of networking and federation in order to understand that, well actually, this is a fundamental part of how distributed systems work and isn’t technically a bug.

              From their perspective this seems like a fairly straightforward problem. Obviously (to us) it’s not, but the threshold for the fediverse shouldn’t be that you deeply understand federation if there’s ever going to be meaningful adoption.

              As an aside, your personal domain is timing out.

              • Jeena@jemmy.jeena.net
                1 year ago

                Damn, thanks. I have a bad implementation of getting Twitter avatars, and now that Twitter redirects everything that is not logged in, my implementation goes into redirect hell every time someone opens a page with a Twitter comment. Perhaps I’ll find the time tonight to look for a fix.

                • Jeena@jemmy.jeena.net
                  1 year ago

                  It seems I was able to fix it by adding curl.max_redirects = 3 to my caching code. No idea why it would hang without it because it gets the image from Twitter just fine now too.

        • bobman@unilem.orgOP
          1 year ago

          Uh… no it’s not.

          I’m sorry, but what you’re doing is actively making this service harder to use by suggesting that ‘all’ should only mean ‘the communities other community members have subscribed to that contain that string.’

          Where do the community members even find the ones to subscribe to? Oh, they use a third-party service or ‘just know’ because… whatever reason.

          Gee, fediverse design strikes again. Sorry, it has to be said. It really does.

    • bobman@unilem.orgOP
      1 year ago

      ‘All’ to me means “”“all”“” the servers my instance can connect to that contain that string.

      It’s a very simple concept.

      • Jeena@jemmy.jeena.net
        1 year ago

        It’s just not very simple, quite the contrary. You would need a server park like reddit has to store everything on every instance, and the databases would be so big that you would need specialists running just the database servers.

        • bobman@unilem.orgOP
          1 year ago

          Oh yes, it really is.

          The implementation may not be easy, but the concept is very simple.

  • jet@hackertalks.com
    1 year ago

    Someone will implement it.

    The protocol itself is decentralized. Which is good.

    If an app wants to use a central service to search, that’s an option available to them.

    • Oisteink@feddit.nl
      1 year ago

      The userbase doesn’t care about how the tech works under the hood. The userbase sees no content and goes back to Reddit.

      • Blaze (he/him)@sopuli.xyz
        1 year ago

        If they can’t bother with investigating the platform for 10 minutes, I think they should stay on Reddit and keep complaining about the awful app and website over there.

        • Oisteink@feddit.nl
          1 year ago

          Winning tactic. I’d like to stay in a place with people with knowledge and interesting viewpoints, regardless of their ability to find search services on other websites to locate content.

          • Blaze (he/him)@sopuli.xyz
            1 year ago

            At this very moment, there is a choice between two options

            • an easy to use place managed by a company that sees user data as resources to be sold to advertisers
            • an emerging platform where some features are still being implemented, but without any tracking of its users, and managed by volunteers

            Hopefully in the near future some features such as the one highlighted by OP will be integrated in the platform, but right now, it’s not, which is why I said that if people cannot search a bit about the current state of Lemmy, they should probably head back to Reddit. And I say that hoping that once the platform is polished enough, they’ll come back.

            • Oisteink@feddit.nl
              1 year ago

              How does posting and reading posts work, and how do you know that nobody tracks their users? I was under the assumption that admins of a node have total access to data going in/out/through their instance.

              • Blaze (he/him)@sopuli.xyz
                1 year ago

                Just have a look at the data accessed by the apps, both stores display them. It’s something else than the Reddit app.

                There might be some data aggregation on the server side indeed, but compared to the ads promotion machine that Reddit has become (and even announced openly, with subreddits now being platforms to promote products), it’s a completely different story.

                • Oisteink@feddit.nl
                  1 year ago

                  What app? As far as I’m concerned there’s no reason to believe that fediverse users aren’t tracked. Probably not all, but where there are users interacting with each other discussing different subjects, there’s money to be made, and data to sell to AI companies for training.