ContentVersion vs. Attachment vs. ContentDocumentLink: Salesforce File Storage Explained
If you've ever tried to query a file in Salesforce and wondered why there are four different objects involved — and why the same UI surface lists files from two of them but not the others — you're not alone. The Salesforce file model accumulated complexity over a decade of platform evolution, and the result is a set of four objects with overlapping but distinct responsibilities. This guide is the reference for what each one does, how they relate, and what it means when you're migrating, querying, or debugging files.
Why this is confusing
Salesforce went through a major file-storage architecture change around 2010–2012, when "Files" (built on Chatter Files / Content) was introduced as the modern alternative to the legacy Attachment object. Both systems coexist today. The Salesforce UI partially abstracts this — the "Notes & Attachments" related list shows both legacy Attachments and modern ContentDocuments, while the "Files" related list shows only modern files. So the same record can have files in two completely different storage models, with two different APIs, exposed in two different UI surfaces.
On top of that, the modern Files system splits into multiple objects to support versioning and many-to-many record sharing. The result: four objects (Attachment, ContentDocument, ContentVersion, ContentDocumentLink) where you'd expect one or two, with names that are easy to confuse.
Quick reference table
| Object | ID prefix | What it represents | Stores binary data? | Multi-record sharing? |
|---|---|---|---|---|
| Attachment | 00P | A legacy file attached to one parent | Yes (Body) | No |
| ContentDocument | 069 | The logical file (modern) | No | Yes (via ContentDocumentLink) |
| ContentVersion | 068 | One version of a ContentDocument's binary | Yes (VersionData) | N/A |
| ContentDocumentLink | 06A | Links a ContentDocument to a parent record | No | Implements it |
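Because each object has its own ID prefix, the first three characters of a record ID are enough to tell which of the four objects you are looking at. The following is a minimal Python sketch of that lookup, using only the prefixes from the table; the names `FILE_OBJECT_PREFIXES` and `classify_file_id` are illustrative and not part of any Salesforce library.

```python
from typing import Optional

# Prefixes taken from the quick reference table.
FILE_OBJECT_PREFIXES = {
    "00P": "Attachment",
    "069": "ContentDocument",
    "068": "ContentVersion",
    "06A": "ContentDocumentLink",
}


def classify_file_id(record_id: str) -> Optional[str]:
    """Return the file object name for a 15- or 18-character ID.

    Returns None when the ID has an unexpected length or its prefix
    is not one of the four file-related objects.
    """
    if len(record_id) not in (15, 18):
        return None
    return FILE_OBJECT_PREFIXES.get(record_id[:3])
```

A lookup like this can help when a spreadsheet or log mixes IDs from both storage models and you need to route each one to the right query.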
Attachment (legacy)
The simplest model. One Attachment represents one file attached to one parent record. The file binary is stored in the Body field of the Attachment itself. There's no version history — uploading a new version overwrites the old one (or, more commonly in practice, creates a new Attachment with a different name).
Key fields:
- Id: 18-character ID, prefix 00P
- Name: filename
- Body: base64-encoded file binary (limited to 25 MB total per record via REST)
- BodyLength: file size in bytes
- ParentId: the parent record this is attached to (must be exactly one)
- OwnerId, CreatedById, CreatedDate: standard audit fields
- IsPrivate: controls visibility
Limitations of Attachment: no version history, no multi-record sharing, no file preview generation, and capped at 25 MB. Salesforce hasn't deprecated it but has stopped recommending it. Most orgs accumulated Attachments in their early years before Files became standard, and many of those records still exist.
ContentDocument (modern)
The modern equivalent — but with a twist. ContentDocument is the logical file, but it doesn't itself store the binary. Think of it as the "wrapper" — it has the file's title, type, and a reference to the latest published version, but the actual bytes live in ContentVersion records associated with it.
Key fields:
- Id: prefix 069
- Title: file name (without extension)
- FileType: extension / MIME type
- FileExtension
- LatestPublishedVersionId: the current ContentVersion
- OwnerId
- SharingPrivacy, SharingOption: control multi-record sharing
ContentDocument has no ParentId — it isn't attached to a single parent. Sharing happens through ContentDocumentLink, described below.
ContentVersion
Where the actual file binary lives. Every ContentDocument has at least one ContentVersion. If users upload new versions of the same file, additional ContentVersion records are created, all pointing back to the same ContentDocument.
Key fields:
- Id: prefix 068
- VersionData: base64-encoded binary (the file content)
- ContentDocumentId: points to the parent ContentDocument
- IsLatest: true if this is the current version
- VersionNumber: '1', '2', etc.
- ContentSize: file size in bytes
- Title, Description, FileExtension, FileType: metadata
- FirstPublishLocationId: used during initial creation to specify the parent record
Quirk: when you upload a new file via the API, you create a ContentVersion directly. Salesforce auto-generates the parent ContentDocument on insert. You don't insert a ContentDocument separately. Setting FirstPublishLocationId on the ContentVersion insert tells Salesforce which record to attach the file to — Salesforce creates the ContentDocumentLink automatically.
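As a rough illustration of that insert, the sketch below builds the field dictionary for a new ContentVersion in Python. It assumes you send it as the JSON body of a create request with whatever API client you already use; the HTTP call itself is left out. `PathOnClient` is a ContentVersion field not covered above (the original file path or name), and the function name `build_content_version_payload` is hypothetical.

```python
import base64
import os
from typing import Optional


def build_content_version_payload(
    filename: str,
    data: bytes,
    parent_id: Optional[str] = None,
) -> dict:
    """Build the fields for a ContentVersion insert.

    Only a ContentVersion is described here; the ContentDocument is
    generated by Salesforce on insert, so no ContentDocument fields
    appear in the payload.
    """
    payload = {
        # Title is the file name without its extension.
        "Title": os.path.splitext(filename)[0],
        # Assumption: PathOnClient carries the original file name.
        "PathOnClient": filename,
        # VersionData holds the base64-encoded binary.
        "VersionData": base64.b64encode(data).decode("ascii"),
    }
    if parent_id is not None:
        # Asks Salesforce to link the new file to this record.
        payload["FirstPublishLocationId"] = parent_id
    return payload
```

Leaving `parent_id` unset produces a file with no parent record association, which you would then share separately through ContentDocumentLink.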
ContentDocumentLink
The junction object. One row per (file, parent record) pair. A single ContentDocument can have many ContentDocumentLinks — that's how the same file can be attached to multiple Accounts, Contacts, or Cases without being duplicated in storage.
Key fields:
- Id: prefix 06A
- ContentDocumentId
- LinkedEntityId: the parent record (Account, Contact, Case, etc.)
- ShareType: V (Viewer), C (Collaborator), I (Inferred)
- Visibility: AllUsers, InternalUsers, SharedUsers
Sharing visibility on a file is determined by the ContentDocumentLink, not by the ContentDocument itself. Different links can grant different access levels: the same file can be shared with Viewer access through one Account and Collaborator access through another. This flexibility is the main reason the model has the extra layer.
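To make the one-row-per-pair idea concrete, here is a small Python sketch that builds ContentDocumentLink field dictionaries for one file shared with several parents at different access levels. The function name `build_links` and the tuple layout of `parents` are assumptions made for this example; the allowed ShareType and Visibility values are the ones listed above.

```python
from typing import Iterable, List, Tuple

VALID_SHARE_TYPES = {"V", "C", "I"}
VALID_VISIBILITY = {"AllUsers", "InternalUsers", "SharedUsers"}


def build_links(
    content_document_id: str,
    parents: Iterable[Tuple[str, str, str]],
) -> List[dict]:
    """Build one link dict per (file, parent record) pair.

    `parents` yields (linked_entity_id, share_type, visibility) tuples.
    """
    links = []
    seen = set()
    for entity_id, share_type, visibility in parents:
        if share_type not in VALID_SHARE_TYPES:
            raise ValueError(f"unknown ShareType: {share_type!r}")
        if visibility not in VALID_VISIBILITY:
            raise ValueError(f"unknown Visibility: {visibility!r}")
        if entity_id in seen:
            # The junction holds one row per (file, parent) pair.
            raise ValueError(f"duplicate parent: {entity_id}")
        seen.add(entity_id)
        links.append({
            "ContentDocumentId": content_document_id,
            "LinkedEntityId": entity_id,
            "ShareType": share_type,
            "Visibility": visibility,
        })
    return links
```

The file's binary is never repeated here: every row carries only the ContentDocumentId, which is why sharing with more parents does not duplicate storage.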
How they relate
Legacy:

    ┌────────────────┐
    │   Attachment   │── ParentId ──► Account / Case / Contact / etc.
    │ (Body inside)  │
    └────────────────┘

Modern:

    ┌──────────────────┐     ┌──────────────────┐
    │ ContentDocument  │ ◄── │  ContentVersion  │  (VersionData inside)
    │      (069)       │     │      (068)       │
    └──────────────────┘     └──────────────────┘
             ▲
             │
             │ ContentDocumentId
             │
    ┌──────────────────────┐
    │ ContentDocumentLink  │── LinkedEntityId ──► Account / Case / Contact / etc.
    │        (06A)         │
    └──────────────────────┘
       (one row per parent)
How sharing actually works
For modern files, sharing is per-link. A ContentDocument by itself isn't visible to anyone except its owner. Each ContentDocumentLink grants access to one parent record's audience. The chain of access is:
- User has access to the parent record (Account, Case, etc.)
- The parent record has a ContentDocumentLink pointing to the ContentDocument
- The link's Visibility setting determines whether the user can see the file (e.g., InternalUsers excludes Community/Experience users)
- The link's ShareType determines whether the user can edit/delete (V = read-only, C = full)
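The chain above can be sketched as a small decision function. This is a deliberately simplified Python model of the four steps, not how Salesforce evaluates sharing internally: the `Link` class, the `file_access` function, and its boolean inputs are all hypothetical. Two assumptions are worth flagging: SharedUsers is treated like AllUsers here, and ShareType I (Inferred) is taken to follow the user's access level on the parent record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    share_type: str   # "V", "C", or "I"
    visibility: str   # "AllUsers", "InternalUsers", or "SharedUsers"


def file_access(
    has_parent_access: bool,
    parent_can_edit: bool,
    is_internal_user: bool,
    link: Optional[Link],
) -> str:
    """Return "none", "read", or "edit" for one user and one link."""
    # Steps 1 and 2: the user must reach the parent record, and a
    # link must exist between that parent and the file.
    if not has_parent_access or link is None:
        return "none"
    # Step 3: InternalUsers hides the file from external users.
    if link.visibility == "InternalUsers" and not is_internal_user:
        return "none"
    # Step 4: ShareType decides read-only versus full access.
    if link.share_type == "C":
        return "edit"
    if link.share_type == "I":
        # Assumption: inferred access mirrors the parent record.
        return "edit" if parent_can_edit else "read"
    return "read"
```

When a file is linked to several parents, a user's effective access would come from evaluating each link they can reach, which is why the same file can be read-only for one audience and editable for another.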
For legacy Attachment, sharing is simpler and tighter: access to the parent record is access to the Attachment, with optional IsPrivate overrides.
Migration and storage implications
- Migration: moving files between orgs requires understanding which model each file uses. Legacy Attachments migrate as Attachments in the target (or are converted to ContentDocuments in flight, if you decide to modernize). Modern files require creating new ContentVersions in the target, plus ContentDocumentLinks for every parent association. See the file migration guide for the full process.
- Counting: counting modern files is more complex than legacy because of the link-based sharing. A naive query of ContentDocumentLink double-counts files shared with multiple parents. The attachment counting guide covers the right SOQL for each case.
- Storage: multiple ContentVersions per ContentDocument stack up. Version history is one of the silent contributors to file storage growth, because version 1 is rarely deleted when version 2 uploads. Use SUM(ContentSize) grouped by IsLatest to see the cost.
- Querying: relationship navigation between these objects is non-obvious. ContentDocument.LatestPublishedVersion.ContentSize is the path for "latest version's size of this file." Practice with the schema before writing migration queries.
- Sharing during migration: preserving sharing visibility across orgs requires migrating ContentDocumentLink.Visibility and ShareType, not just the file itself. A file migrated without its link metadata lands in the target with default sharing, which is usually wrong.
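The counting and storage points can be illustrated without touching an org. The Python sketch below assumes the ContentDocumentLink and ContentVersion rows have already been retrieved and are held as plain dictionaries keyed by field name; the function names are hypothetical.

```python
from typing import Dict, Iterable, Mapping


def count_distinct_files(links: Iterable[Mapping]) -> int:
    """Count files, not links.

    A file shared with three parents has three ContentDocumentLink
    rows but only one ContentDocumentId, so deduplicate on that.
    """
    return len({link["ContentDocumentId"] for link in links})


def storage_by_is_latest(versions: Iterable[Mapping]) -> Dict[bool, int]:
    """Sum ContentSize (bytes) separately for latest and older versions."""
    totals = {True: 0, False: 0}
    for version in versions:
        totals[bool(version["IsLatest"])] += version["ContentSize"]
    return totals
```

The `False` bucket from `storage_by_is_latest` is the storage held by superseded versions, which is the figure to watch when version history is suspected of driving growth.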
Working with both file models?
EKE Services tools handle modern ContentDocuments and legacy Attachments in one workflow — counting, migrating, and archiving. No object-by-object scripting, no manual classification, no missing files because they were the wrong type.