# LunarCrush Beauty Influencer Panel — Holocene Advisors

Custom-built creator panel for Holocene Advisors' beauty research, per the May 21 request from Courtney Babbidge.

## What's in this delivery

Four per-network creator panels (Parquet) + one combined post dump (Parquet).

| File | Rows | Size | Description |
|---|---:|---:|---|
| `holocene_beauty_creators_tiktok.parquet` | 10,000 | 0.9 MB | TikTok creator panel |
| `holocene_beauty_creators_twitter.parquet` | 10,000 | 0.6 MB | X (Twitter) creator panel |
| `holocene_beauty_creators_instagram.parquet` | 8,064 | 0.4 MB | Instagram creator panel |
| `holocene_beauty_creators_youtube.parquet` | 7,805 | 0.8 MB | YouTube creator panel |
| `holocene_beauty_posts.parquet` | 79,166 | 10.6 MB | All beauty posts from the selected creators (last 60 days) |

**Total: 35,869 unique creators, 79,166 unique posts.**

## Tier distribution (per Courtney's spec)

Target distribution, applied to a panel of 10,000 creators per network: nano 48% / micro 32% / mid 12% / macro 4% / top 2% / mega 2%.

Achieved:

| Tier (followers) | TikTok | Twitter/X | Instagram | YouTube |
|---|---:|---:|---:|---:|
| Nano (0-20k) | 4,800 | 4,800 | 3,607 | 3,896 |
| Micro (20-80k) | 3,200 | 3,200 | 2,457 | 1,909 |
| Mid (80-250k) | 1,200 | 1,200 | 1,200 | 1,200 |
| Macro (250-500k) | 400 | 400 | 400 | 400 |
| Top (500k-1M) | 200 | 200 | 200 | 200 |
| Mega (1M+) | 200 | 200 | 200 | 200 |
| **Total** | **10,000** | **10,000** | **8,064** | **7,805** |

**Honest disclosure on the Instagram + YouTube shortfalls:** Instagram is 1,936 creators short of the 10,000 target and YouTube is 2,195 short, all in the nano and micro tiers. The discoverable beauty creator universe on those platforms (creators who *posted about beauty topics* during the 60-day window we surveyed) is smaller at the nano/micro tier than the strict 48%/32% distribution requires. We delivered every qualifying creator we found in those tiers rather than padding with non-beauty-relevant accounts. On TikTok and X (Twitter), the long tail of nano beauty creators is much deeper, so the full distribution is met exactly.
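
The per-tier gaps can be recomputed from the table. The sketch below is illustrative only: the percentages and achieved counts are copied from above, and the names (`TARGET_PCT`, `shortfalls`, and so on) are made up for this example.

```python
# Target share per tier in percent, from the spec above.
TARGET_PCT = {"nano": 48, "micro": 32, "mid": 12, "macro": 4, "top": 2, "mega": 2}
PANEL_SIZE = 10_000  # target creators per network

# Achieved counts per network, copied from the table above.
ACHIEVED = {
    "tiktok":    {"nano": 4800, "micro": 3200, "mid": 1200, "macro": 400, "top": 200, "mega": 200},
    "twitter":   {"nano": 4800, "micro": 3200, "mid": 1200, "macro": 400, "top": 200, "mega": 200},
    "instagram": {"nano": 3607, "micro": 2457, "mid": 1200, "macro": 400, "top": 200, "mega": 200},
    "youtube":   {"nano": 3896, "micro": 1909, "mid": 1200, "macro": 400, "top": 200, "mega": 200},
}

def tier_targets(panel_size=PANEL_SIZE):
    """Convert the percentage spec into a creator count per tier."""
    return {tier: panel_size * pct // 100 for tier, pct in TARGET_PCT.items()}

def shortfalls(achieved):
    """Return, per network, only the tiers that fell short and by how many."""
    targets = tier_targets()
    return {
        network: {t: targets[t] - n for t, n in counts.items() if n < targets[t]}
        for network, counts in achieved.items()
    }

gaps = shortfalls(ACHIEVED)
for network, gap in gaps.items():
    print(network, gap or "on target")
```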

## Methodology

### Universe construction

Pulled the top-engagement posts for **34 beauty-relevant topics × 60 days** from the LunarCrush data backbone via `/api4/public/topic/{slug}/posts/v1?start=X&end=Y`. Topics included:

- **Broad keywords:** beauty, makeup, skincare, haircare, fragrance, cosmetics, perfume, wellness, self-care, lipstick, mascara
- **Routines / aesthetics:** skincare-routine, morning-routine, makeup-tutorial
- **Premium brands:** charlotte-tilbury, drunk-elephant, tatcha, la-mer, skinceuticals, olaplex
- **Mass + DTC brands:** fenty-beauty, fenty, rare-beauty, elf-cosmetics, elf, mac-cosmetics, mac, the-ordinary, cerave, la-roche-posay, paulas-choice
- **Retailers:** sephora, ulta, ulta-beauty
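
As a rough illustration of the pull, the sketch below builds one request URL per topic slug for a 60-day window. Only the endpoint path comes from the description above. The base URL, the use of Unix epoch seconds for `start` and `end`, the example end date, and the helper names are all assumptions, and authentication is left out.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

BASE_URL = "https://lunarcrush.com"  # assumption: host is not stated in this document
TOPIC_SLUGS = ["beauty", "skincare", "sephora"]  # a subset of the 34 slugs listed above

def topic_posts_url(slug, start, end):
    """Build the posts URL for one topic between two timezone-aware datetimes."""
    # Assumption: start/end are passed as Unix epoch seconds.
    query = urlencode({"start": int(start.timestamp()), "end": int(end.timestamp())})
    return f"{BASE_URL}/api4/public/topic/{slug}/posts/v1?{query}"

def window_urls(slugs, end, days=60):
    """One URL per slug covering the `days` before `end`."""
    start = end - timedelta(days=days)
    return [topic_posts_url(slug, start, end) for slug in slugs]

# Illustrative window end; the actual survey window is not stated here.
window_end = datetime(2026, 5, 22, tzinfo=timezone.utc)
for url in window_urls(TOPIC_SLUGS, window_end):
    print(url)
```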

Yielded **1,014,688 post records** → deduped to **79,166 unique posts** → **145,563 unique creators** across TikTok, Instagram, X (Twitter), YouTube, plus Reddit (excluded — no follower model).
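
The dedup step collapses repeated records of the same post into a single row, for example when one post matches more than one topic. A minimal pandas sketch on toy data, assuming each record carries the `post_id`, `creator_id`, and `beauty_topic` fields from the post schema. Which topic is kept for a multi-topic post, and whether Reddit is dropped before or after deduplication, are not specified here, so both choices below are assumptions.

```python
import pandas as pd

# Toy post records; p1 and p3 each appear under two topics.
records = pd.DataFrame({
    "post_id": ["p1", "p1", "p2", "p3", "p3"],
    "creator_id": ["tiktok::1", "tiktok::1", "youtube::9", "reddit::4", "reddit::4"],
    "beauty_topic": ["fenty", "fenty-beauty", "skincare", "makeup", "beauty"],
})

def dedupe_posts(df):
    """Drop Reddit records, then keep the first record seen per post_id."""
    # creator_id has the form network::id, so the prefix is the network.
    network = df["creator_id"].str.split("::").str[0]
    kept = df[network != "reddit"]  # assumption: filter before dedup
    # Assumption: the first topic seen is the one retained.
    return kept.drop_duplicates(subset="post_id", keep="first").reset_index(drop=True)

unique_posts = dedupe_posts(records)
print(len(unique_posts), "unique posts from",
      unique_posts["creator_id"].nunique(), "creators")
```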

### Selection within each network

For each creator we computed:
- `max_followers` — the largest follower count observed across their beauty posts (we take max because follower counts grow over time)
- `beauty_post_count` — number of posts they made about beauty topics in the 60-day window
- `beauty_topics_count` — number of distinct beauty topics they posted under (signal of breadth)
- `beauty_total_interactions` — sum of 24h interactions across their beauty posts
- `beauty_relevance_score` = `beauty_topics_count × log(beauty_post_count + 1) × log(beauty_total_interactions + 1)`
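
A minimal sketch of how these per-creator metrics could be derived from post-level rows. It uses the post-dump column names and assumes the natural logarithm, since the log base is not stated. The toy data and the function name `creator_metrics` are illustrative.

```python
import numpy as np
import pandas as pd

# Toy post-level rows, one per (post, topic) match.
posts = pd.DataFrame({
    "post_id": ["p1", "p2", "p3", "p4"],
    "creator_id": ["tiktok::1", "tiktok::1", "tiktok::1", "tiktok::2"],
    "creator_followers": [9_000, 9_500, 9_200, 120_000],
    "interactions_24h": [100, 250, 50, 4_000],
    "beauty_topic": ["skincare", "cerave", "skincare", "makeup"],
})

def creator_metrics(posts):
    """Aggregate post rows into one row of selection metrics per creator."""
    m = posts.groupby("creator_id").agg(
        max_followers=("creator_followers", "max"),
        beauty_post_count=("post_id", "nunique"),
        beauty_topics_count=("beauty_topic", "nunique"),
        beauty_total_interactions=("interactions_24h", "sum"),
    )
    # log1p(x) is log(x + 1); natural log is an assumption.
    m["beauty_relevance_score"] = (
        m["beauty_topics_count"]
        * np.log1p(m["beauty_post_count"])
        * np.log1p(m["beauty_total_interactions"])
    )
    return m.reset_index()

metrics = creator_metrics(posts)
print(metrics)
```

Because the score multiplies by `beauty_topics_count`, a creator who posts under several topics outranks one with similar volume under a single topic.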

Creators were bucketed into tiers by `max_followers` (delivered as the `followers` column), sorted within each tier by relevance score in descending order, and the top N taken to fill Courtney's distribution.
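
The bucketing and top-N fill can be sketched as follows. Tier boundaries and quotas come from the tier table above. Treating each boundary as the inclusive lower edge of the next tier (for example, exactly 20,000 followers counts as micro) is an assumption, as are the names `TIER_EDGES`, `TIER_QUOTA`, and `select_panel`.

```python
import pandas as pd

TIER_EDGES = [0, 20_000, 80_000, 250_000, 500_000, 1_000_000, float("inf")]
TIER_NAMES = ["nano", "micro", "mid", "macro", "top", "mega"]
TIER_QUOTA = {"nano": 4800, "micro": 3200, "mid": 1200, "macro": 400, "top": 200, "mega": 200}

def select_panel(creators, quota=TIER_QUOTA):
    """Assign tiers by follower count, then keep the top-scoring creators per tier."""
    df = creators.copy()
    # right=False makes bins left-inclusive: [0, 20k), [20k, 80k), ...
    df["tier"] = pd.cut(
        df["max_followers"], bins=TIER_EDGES, labels=TIER_NAMES, right=False
    ).astype(str)
    df = df.sort_values("beauty_relevance_score", ascending=False)
    # Position of each creator within its tier after sorting (0 = best).
    rank_in_tier = df.groupby("tier").cumcount()
    return df[rank_in_tier < df["tier"].map(quota)].reset_index(drop=True)

# Toy example with a quota of one creator per tier.
toy = pd.DataFrame({
    "creator_id": ["a", "b", "c", "d"],
    "max_followers": [5_000, 19_999, 20_000, 2_000_000],
    "beauty_relevance_score": [3.0, 9.0, 1.0, 7.0],
})
toy_panel = select_panel(toy, quota={name: 1 for name in TIER_NAMES})
print(toy_panel[["creator_id", "tier", "beauty_relevance_score"]])
```

When a tier has fewer qualifying creators than its quota, this simply returns all of them, which matches how the Instagram and YouTube nano/micro tiers were handled.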

### Post dump

`holocene_beauty_posts.parquet` contains every unique beauty-topic post made by a selected creator in the 60-day window. **This is beauty-context activity only**: these creators' non-beauty posts are not included. If the team wants full creator activity (any topic), that's a follow-on pull we can do per creator via `/creator/{network}/{id}/posts/v1`. Just let us know.

## Schemas

### Creator panel (`holocene_beauty_creators_{network}.parquet`)

| Column | Type | Description |
|---|---|---|
| creator_id | string | `network::id` — fully-qualified LunarCrush creator ID |
| creator_name | string | Account handle (e.g. "makeupkitchen.ru") |
| creator_display_name | string | Display name when available |
| creator_avatar | string | URL to avatar (cached) |
| network | string | tiktok / instagram / twitter / youtube |
| followers | int64 | Max follower count observed during the 60d window |
| tier | string | nano / micro / mid / macro / top / mega |
| beauty_post_count | int32 | Number of beauty-topic posts (60d) |
| beauty_total_interactions | int64 | Sum of 24h interactions across beauty posts |
| beauty_topics_count | int16 | Distinct beauty topics they posted under |
| beauty_topics | list<string> | Which beauty topics they appeared under |
| beauty_relevance_score | float64 | Selection priority score |
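
Because `beauty_topics` is a list column, per-topic counts need the lists flattened first. A small pandas sketch on toy rows shaped like the panel schema; the handles, values, and the helper name `creators_per_topic` are made up for illustration.

```python
import pandas as pd

# Toy rows with the same column names as the creator panel.
panel = pd.DataFrame({
    "creator_name": ["creator_a", "creator_b"],
    "tier": ["nano", "micro"],
    "beauty_topics": [["skincare", "cerave"], ["skincare"]],
})

def creators_per_topic(panel):
    """Count distinct creators per topic by exploding the list column."""
    exploded = panel.explode("beauty_topics")  # one row per (creator, topic)
    return (
        exploded.groupby("beauty_topics")["creator_name"]
        .nunique()
        .sort_values(ascending=False)
    )

counts = creators_per_topic(panel)
print(counts)
```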

### Post dump (`holocene_beauty_posts.parquet`)

| Column | Type | Description |
|---|---|---|
| post_id | string | LunarCrush post ID |
| post_type | string | tweet / instagram-post / tiktok-video / youtube-video |
| post_link | string | URL to original post |
| post_created | int64 | Unix epoch seconds (UTC) |
| post_sentiment | float64 | Post sentiment score (LC) |
| post_title | string | Post text / title |
| creator_id | string | Links to creator panel |
| creator_name | string | Denormalized for convenience |
| creator_network | string | Network, denormalized for convenience |
| creator_followers | int64 | Followers at time of post |
| interactions_24h | int64 | Interactions in 24h after post |
| beauty_topic | string | Which beauty topic this post matched |
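
Two common first steps with the post dump are converting `post_created` to timestamps and joining posts back to the panel on `creator_id`. A sketch on toy frames that use the schema's column names; the values and the helper name `enrich_posts` are illustrative.

```python
import pandas as pd

# Toy frames with the same column names as the delivered files.
creators = pd.DataFrame({
    "creator_id": ["tiktok::1", "youtube::9"],
    "tier": ["nano", "mega"],
})
posts = pd.DataFrame({
    "post_id": ["p1", "p2", "p3"],
    "creator_id": ["tiktok::1", "tiktok::1", "youtube::9"],
    "post_created": [1_776_000_000, 1_776_086_400, 1_777_000_000],
    "interactions_24h": [120, 80, 5_000],
})

def enrich_posts(posts, creators):
    """Attach each creator's tier to their posts and add a UTC timestamp column."""
    out = posts.merge(
        creators[["creator_id", "tier"]],
        on="creator_id",
        how="left",
        validate="many_to_one",  # fails loudly if creator_id is duplicated in the panel
    )
    # post_created is Unix epoch seconds (UTC) per the schema above.
    out["post_created_utc"] = pd.to_datetime(out["post_created"], unit="s", utc=True)
    return out

enriched = enrich_posts(posts, creators)
print(enriched.groupby("tier")["interactions_24h"].sum())
```

With the real files, `creators` would be the concatenation of the four per-network panels.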

## Reading the data

The files are standard Parquet and load directly in pandas, Polars, or DuckDB:

```python
import pandas as pd
tiktok = pd.read_parquet("holocene_beauty_creators_tiktok.parquet")
posts = pd.read_parquet("holocene_beauty_posts.parquet")

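# Example: top 20 TikTok beauty nanos by relevance score
nanos = tiktok[tiktok.tier == 'nano'].sort_values('beauty_relevance_score', ascending=False)
# Note: the panel column is `beauty_topics` (a list); `beauty_topic` exists only in the post dump
print(nanos.head(20)[['creator_name','followers','beauty_topics','beauty_relevance_score']])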
```

```sql
-- DuckDB
SELECT tier,
       COUNT(*) AS creators,
       AVG(followers) AS avg_followers,
       AVG(beauty_relevance_score) AS avg_relevance_score
FROM 'holocene_beauty_creators_tiktok.parquet'
GROUP BY tier;
```

## Open paths if you want more

- **Full creator activity:** non-beauty posts per creator → say the word, we'll pull `/creator/{network}/{id}/posts/v1` for the panel.
- **Larger Instagram / YouTube panel:** if the tier-distribution shortfall matters, we can (a) widen the topic universe further to surface more nano beauty creators on those networks, or (b) cross-match to external influencer-marketing-platform lists. The current panel is "every qualifying creator we found posting about beauty topics in the last 60 days" — that's the honest universe.
- **Custom beauty-search aggregation:** LunarCrush lets us create persistent custom search aggregations (`/api4/public/searches/create`) with inclusion / exclusion logic. If Holocene has a specific definition of "beauty" they want us to operationalize, we can stand up a custom feed and ship updates on whatever cadence works.
- **Different time window:** 60 days was a pragmatic call. We can extend back to multi-year history if useful.

Reply to Joe at joe@lunarcrush.com — happy to iterate.

Generated 2026-05-22.
