Account Migration

stevebate · July 12, 2023, 3:23pm

I like that. But a standard import/export format might still be useful for platform-independent archival and backup/restore purposes supported by third-party tools.

tchambers · July 12, 2023, 3:34pm

Given Threads’ statement that THEY will support account migration out to Mastodon and other ActivityPub networks, this would seem a critical thing to standardize fully, in ways that we want for the good of the Fediverse platforms.

I did some inventory on which platforms now support migrating Mastodon accounts in, and support exporting social graphs OUT to Mastodon, and it seemed like this:

Mastodon (of course)
Calckey (including the ability to move posts)
Pleroma
Mitra

Do I have that list right? It would be good to include in any documentation we do which Fedi platforms support this de facto standard now, and the full standard in the future.

stevebate · July 12, 2023, 5:00pm

Who knows what Threads support for Mastodon account migration means? It could be that they support the Move activity and that’s it. “Migration” doesn’t necessarily imply any data is being transferred in the process. I’m guessing Meta doesn’t even know what it means yet. The W3C representative to the SWICG is a self-described “newb to the ActivityPub standard”, which is perfectly fine, but it causes me to think whoever is promising account migration may be more involved in marketing than technology.

However, I think it would be good to evaluate what really exists for account migration, including data, in various platforms. Note that some platforms have both “backup” and “export” options. Backups typically cannot be imported for migration purposes. Mastodon, for example, allows backup of an account, including posts/statuses and associated media. However, that can’t be loaded or restored. It’s interesting though that it uses the “ActivityPub format” for the backups (multiple files of JSON-LD content).

Mastodon supports exporting 6 categories of other data that can be imported into other Mastodon instances (lists, blocks, mutes, etc.). These are saved in a platform-specific CSV file, but they can be loaded by other platforms that use the Mastodon data format. However, the other platforms typically only support importing some of these CSV files. It would be good to have a grid showing what specific data can be migrated between platforms and in what format.
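
To make the CSV point concrete, here is a rough sketch of reading one of these exports in Python. The header name `Account address` and the sample rows are assumptions for illustration only; headers may differ by platform and version, so the function falls back to the first column when the header is absent.

```python
import csv
import io

def read_account_addresses(csv_text, column="Account address"):
    """Return account addresses from a follows/blocks-style CSV export.

    `column` is an assumed header name. If it is not found, every row
    (including the first) is treated as data and the first cell is used.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header = rows[0]
    if column in header:
        index = header.index(column)
        data = rows[1:]
    else:
        index = 0
        data = rows
    # Skip blank lines and rows that are too short to hold the column.
    return [row[index].strip() for row in data
            if len(row) > index and row[index].strip()]

# Hypothetical sample export, not taken from a real instance.
sample = (
    "Account address,Show boosts\n"
    "anna@beta.example.com,true\n"
    "carl@beta.example.com,false\n"
)
addresses = read_account_addresses(sample)
```

A tool that only needs the list of addresses can ignore the extra columns, which is probably why partial imports across platforms work at all.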

The Pleroma documentation says it can “export and import a list of people you follow and block, in case instance’s database gets reverted or if you want to move to another server.” It doesn’t say the exported data is compatible with the Mastodon import CSV format.

I migrated from a Pleroma instance to a Mastodon instance in December 2022 and it was painful and frustrating. From what I remember, the “Move” activity part was the only thing that worked smoothly. I had to write code to convert some of the data to Mastodon formats for import and it didn’t all transfer. (It’s possible they improved those features since then, but this wasn’t very long ago.)

When you say Calckey has the ability to move posts, can it actually move posts between instances with different domain names? I’m curious how it does that. If they are doing that successfully, it could suggest how it could be done in other server implementations.

To be clear, I fully support your goal to rally the troops on these topics. I just don’t think we’ll do ourselves a favor by oversimplifying the problem or the current migration situation.

mk3 · July 12, 2023, 5:32pm

Sure. This sounds like a “guide”, which can be helpful. But I was assuming we were aiming for W3C specification level standards to come out of our discussions–not necessarily guides. But if so, I guess I’m unclear on the goals of our discussions here.

for item 6, do you mean redirecting just for actor/account requests or also for all the objects attributed to the actor on server A?

Haha those are implementation details that I was trying to avoid getting into. But I was just raising the point that whatever needs to be done there would be platform-specific. Steps to deprecating an account that has been migrated away would depend on how users were accessing the migrated account on the platform. Maybe a 301 redirect is sufficient for one, but not for all. Maybe there are other things the platform needs to take into account, etc.
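
Purely as an illustration of why this is platform-specific, here is a minimal sketch of a redirect lookup. The two mapping tables and the URL paths are hypothetical; a real server would decide for itself whether it redirects only actor requests or also individual objects.

```python
def redirect_for(path, moved_actors, moved_objects=None):
    """Return (status, location) for a request path, or None to serve normally.

    moved_actors maps an old actor path to the new actor URL.
    moved_objects optionally maps old object paths to new URLs; a platform
    that only redirects actor requests simply leaves it empty.
    """
    moved_objects = moved_objects or {}
    if path in moved_actors:
        return (301, moved_actors[path])
    if path in moved_objects:
        return (301, moved_objects[path])
    return None

# Hypothetical example: only the actor has been mapped.
actors = {"/users/anna": "https://b.example/users/anna"}
actor_result = redirect_for("/users/anna", actors)
object_result = redirect_for("/users/anna/statuses/1", actors)
```

With only the actor mapped, the object request falls through and is served (or not) by whatever the old server already does.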

stevebate · July 12, 2023, 5:56pm

I was thinking of more than a guide. I had in mind something that would very specifically define the import/export format for interoperable ActivityPub data exchange. I think this could be an improvement (obviously more discussion is needed) over the current approach where each implementation defines its own ad-hoc import/export formats.

In any case, I thought we were discussing potential account migration-related FEPs here. A formal W3C specification effort is a very different topic.

tchambers · July 12, 2023, 8:25pm

Fully agree: who knows what Meta will eventually do, hence my attempt at shoring up what the current best practices and “de facto standards” would be ahead of that.

Agreed too on this:

" I fully support your goal to rally the troops on these topics. I just don’t think we’ll do ourselves a favor by oversimplifying the problem or the current migration situation."

I’m the first to say, my work right now is the most blunt crowdsourced instrument, trying to start the move toward the clear and precise versions to come. Let me loop back with more hard research on exactly the Calckey migration process and what it entails.

stevebate · July 13, 2023, 11:53am

I created a temporary Calckey instance. It looks like the post export is export-only, like the Mastodon post backup. I don’t see an option for importing it. The posts are stored in a proprietary (and lossy, from what I can see) JSON format. Media is not part of the export. It looks like Calckey has a “drive” feature where you can access media and other uploaded or exported files, but I don’t see a way to do a bulk download from the drive.

I don’t think this feature should be included in any discussions about account migration.

hamishcampbell · July 13, 2023, 12:41pm

Is there any reason we can’t have a standard export/import format for all our ActivityPub codebases:

Account - follows - following - profile
content
media

The rest can be proprietary (and lossy), but the basics should be able to move from any codebase to any codebase?

melvincarvalho · July 13, 2023, 1:13pm

JSON-LD was designed to do this, and it does provably work as an export and merge tool. To the extent that AP complies with JSON-LD (it doesn’t fully, but imho should) export and merge should fall out for free. Though migration involves more than that, for example changing followers.

tchambers · July 13, 2023, 10:37pm

Here was one suggestion for formats for social media post content, from an open standard used in blog post content migration.

https://tantek.com/2023/112/t2/account-migration-post-blog-archive-format

stevebate · July 14, 2023, 1:01am

It’s an interesting idea. The Mastodon user data export is similar. It has the post contents and media and some other related data like bookmarks and likes in JSON-LD ActivityStreams 2.0 format. I can imagine a tool that could import that archive and rewrite the URIs for a new account location. Rewriting the URIs is the easy part (ignoring that the old URIs are still out there and would need redirection or be broken). The hard part is recreating the post relationships (initial posts versus replies, conversations, etc.). In that sense, I think the note/status relationship graph may be a bit more complex than for blog articles. If somebody has an idea about how to do this part, I’d be willing to write code for a proof-of-concept using the Mastodon export.
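
As a starting point for that proof-of-concept, here is a minimal sketch of the "easy part" plus a first cut at the relationship problem. It assumes the archive has already been loaded into a list of ActivityStreams `Note` dicts; the example URIs are placeholders, and only string values that start with the old prefix are rewritten (links embedded inside HTML `content` are left alone).

```python
def rewrite_uris(value, old_prefix, new_prefix):
    """Recursively replace old_prefix with new_prefix in strings that start with it."""
    if isinstance(value, str):
        if value.startswith(old_prefix):
            return new_prefix + value[len(old_prefix):]
        return value
    if isinstance(value, list):
        return [rewrite_uris(v, old_prefix, new_prefix) for v in value]
    if isinstance(value, dict):
        return {k: rewrite_uris(v, old_prefix, new_prefix) for k, v in value.items()}
    return value

def split_replies(notes):
    """Separate notes into thread starters and replies using inReplyTo."""
    starters = [n for n in notes if not n.get("inReplyTo")]
    replies = [n for n in notes if n.get("inReplyTo")]
    return starters, replies

# Placeholder data: one starter, one self-reply, one reply to a remote post.
OLD = "https://old.example/users/anna"
NEW = "https://new.example/users/anna"
notes = [
    {"id": OLD + "/statuses/1", "type": "Note", "inReplyTo": None},
    {"id": OLD + "/statuses/2", "type": "Note", "inReplyTo": OLD + "/statuses/1"},
    {"id": OLD + "/statuses/3", "type": "Note",
     "inReplyTo": "https://other.example/users/frank/statuses/9"},
]
moved = rewrite_uris(notes, OLD, NEW)
starters, replies = split_replies(moved)
```

Self-replies keep their link after the rewrite because both ends move together; replies to remote posts still point at the remote URI, which is where the harder conversation-graph questions begin.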

bumblefudge · July 14, 2023, 10:00am

I think realistically we might be looking at a whole cluster of interlocking FEPs, and if they’re each breaking off a thin layer of the problem, they might be composable enough to work across very different instances. Most of these won’t be finalizable until they’ve been prototyped and those prototypes composed/tested against each other. Which is to say, not an unmanageable project, but a pretty hard coordination challenge without someone’s serious project management hours getting burnt. Entirely doable, though, and perhaps even in a short span of time if we can all find the bandwidth/budget to move quickly and trust each other and the plan we come up with together.

I think Step 4 is far from the only FEP worth writing here, but the cross-dependencies get complicated real quick. I have been thinking that a number of different topics coexist under the vague heading of “migration” and here is my list, slicing things as thinly as possible into the maximum number of things I can imagine being a FEP (perhaps many are not worth FEPing or worth combining, but thinnest atomic unit is a classic Project Manager exercise):

Data model for how Server A exports profiles, subscriptions, follows, etc
Parsing rules for ^ making explicit what Instance B does with that data model, including entries and objects it does not support
- e.g., does it store them in a persisted export file to be merged into a future export file if the user later wants to move to Server C, which does support them? how does that export file get linked to or concatenated to the future export file? etc.
Protocol/interface for Instance B checking Instance A for valid account
- Note: if this protocol could take a dependency on per-account key material, such as on FEP-521a, this gets a lot easier. Could just be a wellknown/did-web pattern, Instance A could sign import request, etc etc.
Protocol/interface for Instance B telling Instance A import was successful and can tombstone
- note that there should be a FEP-521a section, since key material must also be tombstoned/redirected…
Server A behavior and data model for “tombstoned” accounts, including redirects
Location-independent Content-storage (e.g. “just put all uploads on IPFS” or equivalent self-certifying URI scheme) and protocol for layering “upload export/import” to the above (i.e. instance B can fetch and pin all CIDs it finds in export file if Instance A includes them in export)
As this comment points out, there is a crucial moderation layer in export/import-- most instances wouldn’t want to import data unless they had a way to re-moderate that content or search through records of prior moderation against a known schema (assuming they even have resources to do so, or a way to bill the importer for that resource spend…)
- excellent prior art here
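
To illustrate the "parsing rules" item (what Instance B does with entries it does not support), here is one possible sketch. The export shape, with an `entries` list of typed dicts, is entirely hypothetical and only meant to show the carry-forward idea, not a proposed data model.

```python
def import_export(export, supported_types):
    """Split an export into entries this server imports and entries it carries forward."""
    imported, carried = [], []
    for entry in export.get("entries", []):
        if entry.get("type") in supported_types:
            imported.append(entry)
        else:
            # Unsupported entries are kept untouched for a later move.
            carried.append(entry)
    return imported, carried

def build_next_export(own_entries, carried):
    """Concatenate the server's own entries with carried-over ones."""
    return {"entries": list(own_entries) + list(carried)}

# Hypothetical export from Server A; Server B does not support lists.
export_a = {"entries": [
    {"type": "follow", "target": "carl@beta.example.com"},
    {"type": "list", "name": "friends"},
]}
imported, carried = import_export(export_a, supported_types={"follow", "block"})
export_b = build_next_export(imported, carried)
```

Under this approach a later move to Server C would still see the list entry, even though Server B never understood it.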

</ takes project manager hat off and collapses into an exhausted puddle>

mk3 · July 14, 2023, 11:48am

What is “finalizable” in this sense? Like I mentioned in my earlier message, I’m unclear on what an FEP being “finalized” means. Sure I see the repo, I see FEPs. But if they don’t lead to changes/updates in the specifications or working groups, and no platform implements the changes FEPs propose, we’re right back where we started. For that reason, I’m going to hold off on leaving any more feedback here until there’s some clarity about what the ultimate goal is for the thread.

bumblefudge · July 14, 2023, 12:13pm

That’s entirely fair, I think you’ve stuck your finger right in the most optimistic leap in my thinking as a project manager!

Currently a FEP is “final” when the document has passed editorial review to the mutual satisfaction of proposer and editors. That could be considered necessary but insufficient to your definition of “finalized” and mine.

What I really meant by “finalized” was “two or more independent implementations doing it in a way that’s demonstrably interoperable”, which is a bunch of handwavey nonsense unless we include some kind of test suite, with at least compilable if not production code on two implementations that passes it. There is work underway elsewhere on this very socialhub towards a test harness for core spec that could be extended by some optional tests and vectors for specific FEPs-- I’m assuming in my proposal here that that parallel work continues and arrives at a useful place for this one to build on in time, if you’re taking inventory of the various unicorns I believe in and assessing how many bridges in Florida to sell me.

Once we have all that, we could turn that FEP+test suite+running code into a Community Note of the CG, and/or an input document to future W3C specs. But I assume this last step is not the driving factor for people-- it’s the demonstrable coöperation in production that lets us push back on alternative interpretations of what “migration” means if, say, a multi-million-user server was arguing there was only one reasonable path forward.

lrhodes · July 14, 2023, 9:55pm

For several years now, Anna has operated @anna@beta.example.com, an account on Mastodon instance Beta, but has decided that she’d be a better fit on instance Gamma. Under the simplest conception of full account portability, she would be able to reserve @anna@gamma.example.net, input that new address in the preferences on her Beta account, confirm her intention to migrate, then sit back while the two instances coordinate the exchange of the ActivityPub Collections associated with her Beta account to her new account on Gamma. After a few minutes, she should be able to log into Gamma and have at her disposal all of her followers and follows, her blocks and mutes, her lists and complete post history from the prior account.

Currently, that is not how things work on most federated services, but it appears to be the ideal many of us are working toward. I assume that the technical hurdles are surmountable, albeit with trade-offs that warrant caution. What I want to concentrate on here are some social difficulties I’ve not seen widely discussed.

Case 1 involves a post set to Public visibility, meaning it would be included in public timelines and that it could, in principle, be viewed by any logged in account, subject to moderation restrictions. To simplify, assume that this post has no recorded interactions—no likes, no boosts, no replies. Nevertheless, Anna wants to take it with her when she migrates to Gamma. This is the least problematic case, and I think most people would say that it’s a fit candidate for migration. Send it along!

Things get murkier from there on. Case 2 is a direct message from Anna to @carl@beta.example.com. Direct messages on Mastodon are unencrypted—as are DMs on most current social media services, but famously so on Mastodon—which means that a sufficiently motivated admin could query the database to see the content of any DM on their server. Trust is thus a factor (even if only implicitly so) and Anna and Carl both felt comfortable having moderately confidential discussions on Beta because they both trusted their admin, Darius, not to pry. But while Anna may trust her new future admin, Carl doesn’t—or, at least, doesn’t know whether he should.

Should Beta transfer Anna’s DM to Gamma? “Full account portability” would seem to suggest that it should. Are her DMs not part of her account? There may be information in the DM (Carl’s email address, for example) that she’d like to retain for later use. But transferring it potentially violates Carl’s expectation that DMs will only be accessible by certain identifiable parties. Maybe Gamma’s administrator bears a private animosity toward Carl. Either way, making unencrypted DMs portable breaks the social agreement implicit in the DM function. You could say that, at least in potentia, it breaks DM functionality altogether.

One solution, of course, is to redesign DMs around robust encryption. Until that happens, though, DMs should probably migrate only with explicit consent from all tagged accounts, or not at all.
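
As a sketch of what "explicit consent from all tagged accounts" could mean in practice, here is a minimal filter. It assumes each DM is a dict whose `to` field lists the recipient actor IDs, and that consent has somehow been collected into a set; how consent is requested and recorded is left open, and the identifiers are placeholders.

```python
def dms_allowed_to_migrate(dms, consents):
    """Keep only DMs where every recipient has explicitly consented.

    dms: list of dicts with a "to" list of recipient actor IDs (assumed shape).
    consents: set of actor IDs that agreed to have their DMs migrated.
    """
    allowed = []
    for dm in dms:
        recipients = set(dm.get("to", []))
        # A DM with no listed recipients is withheld rather than guessed at.
        if recipients and recipients <= consents:
            allowed.append(dm)
    return allowed

# Placeholder data: Carl has not consented, Emily has.
dms = [
    {"id": "dm-1", "to": ["carl@beta.example.com"]},
    {"id": "dm-2", "to": ["emily@beta.example.com"]},
    {"id": "dm-3", "to": ["carl@beta.example.com", "emily@beta.example.com"]},
]
allowed = dms_allowed_to_migrate(dms, consents={"emily@beta.example.com"})
```

Group DMs are the strictest case here: one missing consent withholds the whole message.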

Case 3 concerns a string of posts Anna made during the last election about a sensitive political issue. Instance rules on Gamma allow posts on the subject, but only behind a content warning. Beta doesn’t require content warnings, and Anna didn’t use them on that post. Gamma’s admin is unaware of those posts—they were made more than a year ago—and Anna has forgotten them, so unless someone catches them before her migration, they’ll soon be located on Gamma in violation of its rules. This may not seem like a particularly big deal, but Gamma members take the rule seriously, in part because some of them have PTSD related to the issue, and everyone wants those members to feel welcome. Maybe no one will go back and discover Anna’s political posts, but they might, and their presence there circumvents precautions those members may have taken against having the topic federated into their timelines.

I want to emphasize two points here. One is that, without some form of audit capability, full account portability sharply raises the risk of post hoc rules violations. The other is that, to the extent that an ActivityPub instance hosts a community, the members of that community are stakeholders in those rules. Should they have a say in whether someone can migrate rule violations into the instance? Is it even practical to audit the contents of an incoming account? I don’t see an easy technical fix here—at least, not one that avoids making full account portability more trouble than most people would feel it’s worth.

For Case 4, assume that Beta is run using the Hometown fork of Mastodon. Instance members can flag a post so that the server doesn’t pass it to federated instances, so that the post is only visible to accounts on Beta. This lends itself to lots of use cases, but Beta uses it mostly to discuss instance governance. Anna has been reasonably involved in those discussions, which are conducted on the assumption of discretion. It would be easy for people who weren’t involved to misinterpret them or blow certain disagreements out of proportion, which is why Darius would prefer that they weren’t exposed to the members of another instance. To complicate matters, Gamma is a standard Mastodon instance, so the flag marking those posts as local-only would likely get dropped in the transfer, allowing those posts to federate throughout the network.

Should Darius, as Beta’s admin, have an easy way to prevent those posts from migrating? That, of course, would break Anna’s expectation of full account portability. Allowing them to migrate, though, would break everyone else’s expectation of localization. And allowing Darius to assert a partial restriction could be taken as a general argument for giving admins veto power on account portability.

The technical solutions here seem rather straightforward. A site-wide setting could allow admins to withhold local-only posts in the case of account migration, with notifications informing the account-holder of the restriction. Or local-only posts could be converted to DMs–though that might still potentially expose them to other admins, as in Case 2. Either way, the dilemma points, I think, to the broader principle that some activities belong as much to their context as to the accounts participating in them—a principle that might seem rather obvious in the case of explicitly localized posts, but which, to the extent that it asserts limitations over their desire for full account portability, many people would likely reject.
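
The first option (a site-wide setting plus a notification) might look roughly like this. The `local_only` key and the notice wording are hypothetical; Hometown's actual storage of the flag is not assumed here.

```python
def filter_for_migration(posts, withhold_local_only=True):
    """Return (posts_to_send, notices) for an outgoing migration.

    Each post is a dict with hypothetical keys "id" and "local_only".
    When the site-wide setting is on, local-only posts are withheld and
    a notice is produced for the account-holder.
    """
    to_send, notices = [], []
    for post in posts:
        if withhold_local_only and post.get("local_only", False):
            notices.append(f"Post {post['id']} is local-only and was not migrated.")
        else:
            to_send.append(post)
    return to_send, notices

posts = [
    {"id": "1", "local_only": False},
    {"id": "2", "local_only": True},
]
to_send, notices = filter_for_migration(posts)
```

The code is the easy part; the setting itself is the admin veto discussed above, just expressed as a boolean.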

Case 5 is Anna’s fifth reply in a conversation with Frank on instance Delta, a conversation they were able to have because Beta and Delta are federated. Frank, however, has blocked the entire Gamma domain, so the conversation would not have been easily accessible from that instance prior to Anna’s decision to migrate. How should Beta treat the request to transfer that reply to Gamma?

In principle, this may seem unproblematic. Since Beta and Gamma are federated, a standalone post created on Beta would appear on Gamma, even if the author had tagged Frank in the text. And while Anna’s reply to Frank might not have federated to Gamma on its own, @emily@beta.example.com could have boosted it into the timelines of accounts that follow Emily from Gamma. From a safety and privacy point of view, migrating Anna’s reply to Frank looks like a variation on a question that has already been answered.

But then, what happens to the record of their past conversation? Assuming that Frank’s domain block holds, the connection between their responses to one another will effectively be broken. Frank will no longer be able to see Anna’s responses, and Anna will no longer be able to see the context of her replies. Logically, this is a predictable consequence given the federation parameters in this case, but is that the behavior Anna would expect? If full account portability was a means of preserving a record of her social interactions, this effectively undermines that goal. And Frank may one day decide to revisit that exchange only to find that a decision he made before the conversation has effectively blocked half of the conversation months after the fact.

There may be solutions to the technical problem of a broken public reply chain (e.g. fetching the thread from an unauthenticated view), but it’s impossible to anticipate whether they’re desirable in most cases since breaking the reply chain may or may not be the desired result: Perhaps Anna is moving to Gamma in order to get away from Frank, or maybe Frank really wants to be insulated from anyone who would associate themselves with Gamma.

At the very least, account nomads need to be notified in advance that migrating a post archive could break their prior interactions with other accounts. Full account portability is a simple concept masking a complex range of potential results; it’s important to set expectations accordingly.

More generally, Case 5 serves as an illustration of the principle that social belonging may be multivalent. In technical terms, the entity that “owns” Anna’s reply may be subject to change, but can be described in fairly unambiguous terms using the language of the ActivityPub protocol. Socially, though, the context it relies on for its meaning consists not just of Anna, but also of Frank, and not just of those two people but also, as the dilemma indicates, of the relationship between the instances that host their accounts. Removing it from one or more of those contexts threatens to break its meaning. Will Anna still want that decontextualized post once she has made the move to Gamma? Is full account portability as valuable to her if the result is an account dotted with orphaned replies?

To some extent, the dilemmas in each of these case examples arise because our social media behavior is often calibrated for a narrow context, even if we generally prefer to see ourselves as radically independent individuals standing apart from the network as a whole. Most conceptions of full account portability are grounded in the latter point of view. But that’s not how the fediverse is structured, nor is it how most of us navigate it. The design of account portability should take into account the social principles pointed to here. Otherwise, we’re likely to undermine much of what we value about the spaces we’ve built here.

stevebate · July 14, 2023, 10:50pm

Very nice analysis. The issues are roughly analogous to physical migration (of a household, for example) to another legal jurisdiction (another country or another state in the USA). The laws of those jurisdictions may prevent you from migrating all your stuff. There may be restrictions on the transfer of plants, pets, firearms, health insurance policies, family members (children of separated parents), and so on. Many of the issues you describe are likewise a side-effect of decentralized, local governance in the Fediverse.

even if we generally prefer to see ourselves as radically independent individuals standing apart from the network as a whole

If someone really wants to be independent, I think that the only solution is a single-member, self-hosted instance with a domain name you control. Otherwise, you don’t really own the material you post either publicly or privately.

tchambers · July 17, 2023, 1:18pm

From Kinoa, the Calckey creator:

Perhaps a near term goal:

Categorize the features of account migration NOW used by Mastodon - and then those also used in common by Calckey, Pleroma, Mitra, and take that “de facto standard” now and make it a “de jure” standard via a FEP that ONLY defines those feature sets… which I believe they all support: the move of account name, lists, followers, who you follow, and blocklists.

… then move on to a new fep that grapples with post migration too?

stevebate · July 17, 2023, 4:47pm

Looking at Mastodon, Pleroma and Calckey export/import, the common data sets I saw were “who you follow” (following), blocks and mutes. In Mastodon, you can export lists but not import them (I don’t know why). I also don’t know of a way to import a profile in Mastodon (which I’m assuming is “account name”). Mastodon doesn’t export followers. I’m guessing this is because it will send Move activity messages to them during a migration.

trwnh · July 18, 2023, 5:51pm

Move is only for signaling to followers that they should re-Follow you at a new actor. all other “datasets” are obtained via separate CSV exports, except for the “account archive”, which contains AS2 documents representing collections for the outbox, likes, and so on.
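
For readers who have not seen one, here is a minimal sketch of what a Move activity looks like, along with the `alsoKnownAs` check a receiving server can make before re-following. The actor URLs are placeholders and the activity omits fields such as `id` and addressing that a real server would add.

```python
def build_move(old_actor, new_actor):
    """Build a bare-bones Move activity announcing old_actor -> new_actor."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Move",
        "actor": old_actor,
        "object": old_actor,
        "target": new_actor,
    }

def move_is_confirmed(move, target_actor_doc):
    """Accept the Move only if the target actor lists the old actor in alsoKnownAs."""
    also = target_actor_doc.get("alsoKnownAs", [])
    if isinstance(also, str):
        also = [also]
    return move.get("object") in also

# Placeholder actors.
old = "https://beta.example.com/users/anna"
new = "https://gamma.example.net/users/anna"
move = build_move(old, new)
confirmed = move_is_confirmed(move, {"id": new, "alsoKnownAs": [old]})
unconfirmed = move_is_confirmed(move, {"id": new})
```

Nothing in the activity carries posts, lists or blocks, which is why those travel through the separate exports.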

macgirvin · July 18, 2023, 9:37pm

I’ve created a number of projects that provide full migration within our family of servers and services (and partial migration from other services), and there may be privacy rules or restrictions in place. Both the exporting and importing sites are expected to honor the policies and requirements of their instances. We are not currently performing an audit or reporting content which was not migrated, but that would be a nice outcome. Currently the content is simply refused for export or silently dropped on import. It can also be unconditionally removed by moderators or the admin if it is found to be incompatible with the site policies.

Basically we’ll try and import everything presented. If the sending site does not wish this, it shouldn’t go into an export file and/or should return 403 on direct access. We normally provide everything on export, because it is your backup, and it all belongs to you. We don’t differentiate whether you’re storing it as a backup archive or syncing it to one of your cloned channels.

We do block importation of content which comes from authors or sites that are blocked on this site, or which exceeds your cloud storage quota. We had long discussions about this and agreed this was the most practical action. There is also a quota for total number of posts, but this is rarely set by the instance admin. If it is, and your export file contains more posts than we allow, additional posts will be dropped. For this reason I’d probably recommend that post archives be in reverse chronological order so that if anything gets dropped it’s the ancient stuff.
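
A rough sketch of that import policy, not our actual code: the post shape (`author`, `published`) is assumed, and timestamps are taken to be ISO 8601 strings in the same UTC format so that they sort correctly as text.

```python
def select_for_import(posts, blocked_authors, max_posts=None):
    """Return (kept, dropped) after applying block and post-count rules.

    Posts from blocked authors are refused outright. The remainder is
    ordered newest first so that any quota overflow drops the oldest posts.
    """
    allowed = [p for p in posts if p["author"] not in blocked_authors]
    allowed.sort(key=lambda p: p["published"], reverse=True)
    if max_posts is not None:
        return allowed[:max_posts], allowed[max_posts:]
    return allowed, []

# Placeholder archive with one blocked author and a quota of two posts.
posts = [
    {"id": "a", "author": "anna@beta.example.com", "published": "2013-01-01T00:00:00Z"},
    {"id": "b", "author": "anna@beta.example.com", "published": "2023-07-01T00:00:00Z"},
    {"id": "c", "author": "spam@bad.example", "published": "2022-05-01T00:00:00Z"},
    {"id": "d", "author": "anna@beta.example.com", "published": "2020-03-01T00:00:00Z"},
]
kept, dropped = select_for_import(posts, {"spam@bad.example"}, max_posts=2)
```

If the export file is already in reverse chronological order, the importer can stream it and stop at the quota without sorting at all.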

We don’t concern ourselves with the number of characters in a post, but some projects do - and this could be a huge problem for somebody if suddenly their ten years of blog posts had to be capped at 500 characters each or photos removed from posts because they attached more than 4 photos to some of them. We can also post in HTML.

So this makes it very difficult for Mastodon to import our content. Most of our posts would be truncated, photos removed, and all the formatting lost. We can easily import theirs (I’m currently working to add content from that project to our migrator).
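
A pre-flight check along these lines could at least tell somebody what would be lost before they migrate. The 500-character and 4-attachment figures are the ones mentioned above; the post shape is hypothetical, and a real check would need to measure the text the way the target platform does.

```python
def lossy_posts(posts, max_chars=500, max_attachments=4):
    """Report posts that would be truncated or lose attachments on import.

    Each post is a dict with assumed keys "id", "content" and "attachments".
    Returns a list of (post_id, [problems]) tuples.
    """
    report = []
    for post in posts:
        problems = []
        if len(post.get("content", "")) > max_chars:
            problems.append("content exceeds character limit")
        if len(post.get("attachments", [])) > max_attachments:
            problems.append("too many attachments")
        if problems:
            report.append((post["id"], problems))
    return report

# Placeholder posts: one short, one long, one with five photos.
posts = [
    {"id": "1", "content": "hello", "attachments": []},
    {"id": "2", "content": "x" * 501, "attachments": []},
    {"id": "3", "content": "photos", "attachments": ["p"] * 5},
]
report = lossy_posts(posts)
```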