In Mastodon media can only be used by owners and only be associated with
a single post. We currently allow media to be associated with several
posts and until now did not limit their usage in posts to media owners.
However, media update and GET lookup was already limited to owners.
(In accordance with allowing media reuse, we also still allow GET
lookups of media already used in a post unlike Mastodon)
Allowing reuse isn’t problematic per se, but allowing use by non-owners
can be problematic if media ids of private-scoped posts can be guessed
since creating a new post with this media id will reveal the uploaded
file content and alt text.
Given media ids are currently just part of a sequentieal series shared
with some other objects, guessing media ids is with some persistence
indeed feasible.
E.g. sampline some public media ids from a real-world
instance with 112 total and 61 monthly-active users:
17.465.096 at t0
17.472.673 at t1 = t0 + 4h
17.473.248 at t2 = t1 + 20min
This gives about 30 new ids per minute of which most won't be
local media but remote and local posts, poll answers etc.
Assuming the default ratelimit of 15 post actions per 10s, scraping all
media for the 4h interval takes about 84 minutes and scraping the 20min
range mere 6.3 minutes. (Until the preceding commit, post updates were
not rate limited at all, allowing even faster scraping.)
If an attacker can infer (e.g. via reply to a follower-only post not
accessbile to the attacker) some sensitive information was uploaded
during a specific time interval and has some pointers regarding the
nature of the information, identifying the specific upload out of all
scraped media for this timerange is not impossible.
Thus restrict media usage to owners.
Checking ownership just in ActivitDraft would already be sufficient,
since when a scheduled status actually gets posted it goes through
ActivityDraft again, but would erroneously return a success status
when scheduling an illegal post.
Independently discovered and fixed by mint in Pleroma
1afde067b1
In MastoAPI media descriptions are updated via the
media update API not upon post creation or post update.
This functionality was originally added about 6 years ago in
ba93396649 which was part of
https://git.pleroma.social/pleroma/pleroma/-/merge_requests/626 and
https://git.pleroma.social/pleroma/pleroma-fe/-/merge_requests/450.
They introduced image descriptions to the front- and backend,
but predate adoption of Mastodon API.
For a while adding an `descriptions` array on post creation might have
continued to work as an undocumented Pleroma extension to Masto API, but
at latest when OpenAPI specs were added for those endpoints four years
ago in 7803a85d2c, these codepaths ceased
to be used. The API specs don’t list a `descriptions` parameter and
any unknown parameters are stripped out.
The attachments_from_ids function is only called from
ScheduledActivity and ActivityDraft.create with the latter
only being called by CommonAPI.{post,update} whihc in turn
are only called from ScheduledActivity again, MastoAPI controller
and without any attachment or description parameter WelcomeMessage.
Therefore no codepath can contain a descriptions parameter.
Due to JSON-LD compaction the full address of public scope
may also occur in shorter forms and the spec requires us to treat them
all equivalently. To save us the pain of repeatedly checking for all
variants internally, normalise inbound data to just one form.
See note at: https://www.w3.org/TR/activitypub/#public-addressing
This needs to happen very early, even before the other addressing fixes
else an earlier validator will reject the object. This in turn required
to move the list-tpye normalisation earlier as well, but since I was
unsure about putting empty lists into the data when no such field
existed before, I excluded this case and thus the later fixing had to be
kept as well.
Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/670
The API parameter is not a timestamp but an offset.
If a sufficient amount of time passes between the tests
expires_at calculation and the internal calculation during processing
of the request the strict equality assertion fails. (Either a direct
assertion or indirect via job lookup).
To avoid this lower comparison granularity.
literally nothing uses C2S AP, and it's another route into core
systems which requires analysis and maintenance. A second API
is just extra surface for potentially bad things so let's take
it out back and obliterate it
"id" is used for the canonical link to the AS2 representation of an object.
"url" is typically used for the canonical link to the HTTP representation.
It is what we use, for example, when following the "external source" link
in the frontend. However, it's not the link we include in the post contents
for quote posts.
Using URL instead means we include a more user-friendly URL for Mastodon,
and a working (in the browser) URL for Threads
previously we would uncritically take data and format it into
tags for static-fe and the like - however, instances can be
configured to disallow unauthenticated access to these resources.
this means that OG tags as a vector for information leakage.
_technically_ this should only occur if you have both
restrict_unauthenticated *AND* you run static-fe, which makes no
sense since static-fe is for unauthenticated people in particular,
but hey ho.
Per the XRD specification:
> 2.4. Element <Alias>
>
> The <Alias> element contains a URI value that is an additional
> identifier for the resource described by the XRD. This value
> MUST be an absolute URI. The <Alias> element does not identify
> additional resources the XRD is describing, **but rather provides
> additional identifiers for the same resource.**
(http://docs.oasis-open.org/xri/xrd/v1.0/os/xrd-1.0-os.html#element.alias, emphasis mine)
In other words, the alias list is expected to link to things which are
not just semantically the same, but exactly the same. Old user accounts
don't do that
This change should not pose a compatibility issue: Mastodon does not
list old accounts here (See e1fcb02867/app/serializers/webfinger_serializer.rb (L12))
The use of as:alsoKnownAs is also not quite semantically right here
(see https://www.w3.org/TR/did-core/#dfn-alsoknownas, which defines
it to be used to refer to identifiers which are interchangable) but
that's what DID get for reusing a property definition that Mastodon
already squatted long before they got to it
The newest git HEAD of MIME already knows about APNG, but this
hasn’t been released yet. Without this, APNG attachments from
remote posts won’t display as images in frontends.
Fixes: akkoma#657
This protects us from falling for obvious spoofs as from the current
upload exploit (unfortunately we can’t reasonably do anything about
spoofs with exact matches as was possible via emoji and proxy).
Such objects being invalid is supported by the spec, sepcifically
sections 3.1 and 3.2: https://www.w3.org/TR/activitypub/#obj-id
Anonymous objects are not relevant here (they can only exists within
parent objects iiuc) and neither is client-to-server or transient objects
(as those cannot be fetched in the first place).
This leaves us with the requirement for `id` to (a) exist and
(b) be a publicly dereferencable URI from the originating server.
This alone does not yet demand strict equivalence, but the spec then
further explains objects ought to be fetchable _via their ID_.
Meaning an object not retrievable via its ID, is invalid.
This reading is supported by the fact, e.g. GoToSocial (recently) and
Mastodon (for 6+ years) do already implement such strict ID checks,
additionally proving this doesn’t cause federation issues in practice.
However, apart from canonical IDs there can also be additional display
URLs. *omas first redirect those to their canonical location, but *keys
and Mastodon directly serve the AP representation without redirects.
Mastodon and GTS deal with this in two different ways,
but both constitute an effective countermeasure:
- Mastodon:
Unless it already is a known AP id, two fetches occur.
The first fetch just reads the `id` property and then refetches from
the id. The last fetch requires the returned id to exactly match the
URL the content was fetched from. (This can be optimised by skipping
the second fetch if it already matches)
05eda8d193/app/helpers/jsonld_helper.rb (L168)63f0979799
- GTS:
Only does a single fetch and then checks if _either_ the id
_or_ url property (which can be an object) match the original fetch
URL. This relies on implementations always including their display URL
as "url" if differing from the id. For actors this is true for all
investigated implementations, for posts only Mastodon includes an
"url", but it is also the only one with a differing display URL.
2bafd7daf5 (diff-943bbb02c8ac74ac5dc5d20807e561dcdfaebdc3b62b10730f643a20ac23c24fR222)
Albeit Mastodon’s refetch offers higher compatibility with theoretical
implmentations using either multiple different display URL or not
denoting any of them as "url" at all, for now we chose to adopt a
GTS-like refetch-free approach to avoid additional implementation
concerns wrt to whether redirects should be allowed when fetching a
canonical AP id and potential for accidentally loosening some checks
(e.g. cross-domain refetches) for one of the fetches.
This may be reconsidered in the future.
If it’s not already in the database,
it must be counterfeit (or just not exists at all)
Changed test URLs were only ever used from "local: false" users anyway.
To save on bandwith and avoid OOMs with large files.
Ofc, this relies on the remote server
(a) sending a content-length header and
(b) being honest about the size.
Common fedi servers seem to provide the header and (b) at least raises
the required privilege of an malicious actor to a server infrastructure
admin of an explicitly allowed host.
A more complete defense which still works when faced with
a malicious server requires changes in upstream Finch;
see https://github.com/sneako/finch/issues/224
E.g. *key’s emoji URLs typically don’t have file extensions, but
until now we just slapped ".png" at its end hoping for the best.
Furthermore, this gives us a chance to actually reject non-images,
which before was not feasible exatly due to those extension-less URLs
This actually was already intended before to eradict all future
path-traversal-style exploits and to fix issues with some
characters like akkoma#610 in 0b2ec0ccee. However, Dedupe and
AnonymizeFilename got mixed up. The latter only anonymises the name
in Content-Disposition headers GET parameters (with link_name),
_not_ the upload path.
Even without Dedupe, the upload path is prefixed by an UUID,
so it _should_ already be hard to guess for attackers. But now
we actually can be sure no path shenanigangs occur, uploads
reliably work and save some disk space.
While this makes the final path predictable, this prediction is
not exploitable. Insertion of a back-reference to the upload
itself requires pulling off a successfull preimage attack against
SHA-256, which is deemed infeasible for the foreseeable futures.
Dedupe was already included in the default list in config.exs
since 28cfb2c37a, but this will get overridde by whatever the
config generated by the "pleroma.instance gen" task chose.
Upload+delete tests running in parallel using Dedupe might be flaky, but
this was already true before and needs its own commit to fix eventually.