High Hopes: A Data-Driven Deep Dive into Washington’s Cannabis Market (2024

🌿 Puff, Puff, Plot: Cannabis Sales in WA

Since the legalization of recreational cannabis in Washington, the Washington State Liquor and Cannabis Board (WSLCB) has tracked every bud, edible, and pre-roll in the market. What better time to explore a sample of these Jan 2024– Feb 2025 transactions than the high holiday of #TidyTuesday?

This 420-day dataset sample is cleaned and anonymized. It contains 50,000 sales transactions, covering January 2024 to March 2025. On aggregate and use of ‘floor_date()’ during the data processing for #TidyTuesday format, the time range ends up “2024-01-01” to “2025-02-01”. We still employ the lubridate package and use ‘floor_date()’ throughout this example as a debugging step to keep the date column exactly how it should be for the pipeline (and for sanity purposes).

We’ll take a quick tour through:

🏪 Top-grossing licensees

🌱 Strain-level text analysis (with TF-IDF!)

📈 Time series decomposition & forecasting by product category and strain type - an “indy” workspace - a call to action for you to try #tidyverse and #tidytext sets of functions on your own.

Let’s blaze a trail through the data we’ll call, “wa_canna”.

# Load the data
dataset_url <- "https://raw.githubusercontent.com/Ensign-Analytics/tidytuesday/wa-cannabis/data/curated/wa-cannabis/wa_cannabis_sample.csv"

wa_canna <- readr::read_csv(dataset_url, show_col_types = FALSE)

💸 Who’s Making Green? (Top 5 Retailers)

wa_canna %>%
  group_by(licensee_dba) %>%
  summarise(total_sales = sum(retail_sales_amt, na.rm = TRUE)) %>%
  arrange(desc(total_sales)) %>%
  slice_head(n = 5) %>%
  ggplot(aes(reorder(licensee_dba, total_sales), total_sales)) +
  geom_col(fill = "#6A994E") +
  scale_y_continuous(labels = dollar_format()) +
  coord_flip() +
  labs(title = "Top 5 Licensees in Washington by Retail Sales", x = NULL, y = "Sales ($)") +
  theme_minimal(base_size = 9)

# Optional: Save the plot
#ggsave("plots/top_licensees_sales.png", width = 8, height = 5)

🌿 What’s in a Name? (TF-IDF for Strain Names)

# Pull the top 5 licensees by total retail sales
top_5_licensees <- wa_canna %>%
  group_by(licensee_dba) %>%
  summarise(total_sales = sum(retail_sales_amt, na.rm = TRUE)) %>%
  slice_max(total_sales, n = 5) %>%
  pull(licensee_dba)

# Filter for just those licensees
filtered_wa <- wa_canna %>%
  filter(licensee_dba %in% top_5_licensees)

# Tokenize and clean text from strain_name
tfidf_data <- filtered_wa %>%
  unnest_tokens(word, strain_name) %>%
  filter(!str_detect(word, "^\\d+$")) %>%            # Remove pure numbers
  filter(!word %in% stop_words$word) %>%             # Remove common stop words
  count(licensee_dba, word, sort = TRUE) %>%
  bind_tf_idf(word, licensee_dba, n)

# Get top 5 terms by TF-IDF per licensee
top_terms <- tfidf_data %>%
  group_by(licensee_dba) %>%
  slice_max(tf_idf, n = 5, with_ties = FALSE) %>%
  ungroup()

# Plot as faceted bar chart
ggplot(top_terms, aes(x = fct_reorder(word, tf_idf), y = tf_idf, fill = licensee_dba)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~ licensee_dba, scales = "free_y") +
  coord_flip() +
  labs(title = "Top TF-IDF Terms of Cannabis Strain Names at Top 5 Washington Licensees",
       x = "Term", y = "TF-IDF Score") +
  theme_minimal(base_size = 8)

# Optional: Save the plot
#ggsave("plots/tfidf_terms.png", width = 8, height = 5)

🚬 Curious about product names? Go back and supplant “strain_name” with “product_name” and add scripts to remove special characters (-*&^%$#@?) and numbers and units (e.g. 1oz) using regex for some dank tf-idf analysis.

Because I’m kind …

# Clean + tokenize product_name
tfidf_data <- filtered_wa %>%
  mutate(product_name = str_to_lower(product_name)) %>%
  mutate(product_name = str_replace_all(product_name, "\\d+(\\.\\d+)?\\s*(g|gram|grams|oz|ounce|ounces|mg|lb|lbs)", "")) %>% # remove units like 1g, 3.5g, etc.
  mutate(product_name = str_replace_all(product_name, "[^a-zA-Z\\s]", "")) %>%  # remove special characters
  unnest_tokens(word, product_name) %>%
  filter(!word %in% stop_words$word) %>%
  filter(!word %in% stopwords::stopwords("en")) %>%  # extended stopword filter
  filter(str_detect(word, "[a-z]")) %>%              # keep only actual words
  count(licensee_dba, word, sort = TRUE) %>%
  bind_tf_idf(word, licensee_dba, n)

# Grab top 5 TF-IDF terms for each licensee
top_terms <- tfidf_data %>%
  group_by(licensee_dba) %>%
  slice_max(tf_idf, n = 5, with_ties = FALSE) %>%
  ungroup()

# Faceted bar plot
ggplot(top_terms, aes(x = fct_reorder(word, tf_idf), y = tf_idf, fill = licensee_dba)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~ licensee_dba, scales = "free_y") +
  coord_flip() +
  labs(title = "Top TF-IDF Terms in Product Names (Cleaned)",
       subtitle = "Special characters, units, and numbers removed",
       x = "Term", y = "TF-IDF Score") +
  theme_minimal(base_size = 7)

# Optional: Save the plot
# ggsave("plots/tfidf_cleaned_terms.png", width = 8, height = 5)

📅 Monthly Sales: A Time Series Buzz

# Create time series of total sales by month
monthly_sales <- wa_canna %>%
  mutate(month = floor_date(sale_date_month, unit = "month")) %>%
  group_by(month) %>%
  summarise(retail_sales_amt = sum(retail_sales_amt, na.rm = TRUE)) %>%
  ungroup()

# Convert to tsibble (time-aware tibble)
monthly_ts <- monthly_sales %>%
  as_tsibble(index = month)

# Fill any missing months (explicitly handle gaps)
monthly_ts_filled <- monthly_ts %>%
  fill_gaps(.full = TRUE) %>%
  mutate(retail_sales_amt = ifelse(is.na(retail_sales_amt), 0, retail_sales_amt))

# STL decomposition model
stl_model <- monthly_ts_filled %>%
  model(STL(retail_sales_amt))

# Extract components and plot
components <- stl_model %>%
  components()

autoplot(components) +
  labs(title = "STL Decomposition of Monthly Cannabis Retail Sales")

# Optional: Save the plot
#ggsave("plots/stl_decomposition.png", width = 8, height = 5)

🧪 Decompose by Inventory Category

# Aggregate by month + category/strain
# By Product Category
category_ts <- wa_canna %>%
 mutate(month = yearmonth(floor_date(sale_date_month, unit = "month"))) %>%
  group_by(month, inventory_type_category) %>%
  summarise(retail_sales_amt = sum(retail_sales_amt, na.rm = TRUE), .groups = "drop") %>%
  as_tsibble(index = month, key = inventory_type_category) %>%
  fill_gaps(retail_sales_amt = 0)

#Seasonal decomposition
# STL by Category
stl_category <- category_ts %>%
  model(STL(retail_sales_amt ~ trend(window = 7) + season(window = "periodic")))

# Debugging step
glimpse(components(stl_category))

## Rows: 99
## Columns: 7
## Key: inventory_type_category, .model [8]
## : retail_sales_amt = trend + remainder
## $ inventory_type_category <chr> "Concentrates", "Concentrates", "Concentrates"…
## $ .model                  <chr> "STL(retail_sales_amt ~ trend(window = 7) + se…
## $ month                   <mth> 2024 Jan, 2024 Feb, 2024 Mar, 2024 Apr, 2024 M…
## $ retail_sales_amt        <dbl> 135407.19, 112747.37, 106993.44, 125952.80, 12…
## $ trend                   <dbl> 120210.18, 121306.44, 122402.70, 125730.99, 12…
## $ remainder               <dbl> 15197.0070, -8559.0702, -15409.2575, 221.8108,…
## $ season_adjust           <dbl> 135407.19, 112747.37, 106993.44, 125952.80, 12…

# Decomposition Components
category_components <- components(stl_category) %>%
  mutate(season_adjust = retail_sales_amt - remainder)

removed_cats <- c( "Miscellaneous", "Waste/Other Materials", "Unclassified", "Mixes", "Topicals" )

# Drop model for legend clarity
category_long <- components(stl_category) %>%
  filter(!inventory_type_category %in% removed_cats) %>%
  pivot_longer(cols = c(trend, remainder, season_adjust), names_to = "component", values_to = "value")

# Remove .model from aes() inside autoplot
ggplot(category_long, aes(x = month, y = value, group = interaction(component, inventory_type_category))) +
  geom_line(color = "#198754", linewidth = 0.8, lineend = "round") +
  facet_grid(component ~ inventory_type_category, scales = "free_y") +
  scale_y_continuous(labels = dollar_format()) +
  labs(
    title = "STL Components by Product Category",
    x = NULL, y = "Retail Sales"
  ) +
  theme_minimal(base_size = 7) +
  theme(strip.text = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1))

# Optional: Save the plot
#ggsave("plots/stl_decomposition_category.png", width = 8, height = 5)

🧪 Decompose by Strain Type

# By Strain Type
strain_ts <- wa_canna %>%
  mutate(month = yearmonth(floor_date(sale_date_month, unit = "month"))) %>%
  group_by(month, strain_type) %>%
  summarise(retail_sales_amt = sum(retail_sales_amt, na.rm = TRUE), .groups = "drop") %>%
  as_tsibble(index = month, key = strain_type) %>%
  fill_gaps(retail_sales_amt = 0) %>%
  na.omit()

# Refining tool! 
# Use to focus only on top 5 categories or strains
# top_strains <- wa_filtered %>%
#   count(strain_type, sort = TRUE) %>%
#   slice_head(n = 5) %>%
#   pull(strain_type)
# 
# strain_ts <- strain_ts %>% filter(strain_type %in% top_strains)

# STL by Strain
stl_strain <- strain_ts %>%
  model(STL(retail_sales_amt ~ trend(window = 7) + season(window = "periodic")))

strain_components <- components(stl_strain)

strain_long <- strain_components %>%
  pivot_longer(cols = c(trend, remainder, season_adjust), 
               names_to = "component", 
               values_to = "value") 

ggplot(strain_long, aes(x = month, y = value, group = interaction(component, strain_type))) +
  geom_line(color = "#6A994E", linewidth = 0.8) +
  facet_grid(component ~ strain_type, scales = "free_y") +
  scale_y_continuous(labels = dollar_format()) +
  labs(
    title = "STL Decomposition by Strain Type",
    x = NULL, y = "Retail Sales"
  ) +
  theme_minimal(base_size = 9) +
  theme(strip.text = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1))

# Optional: Save the plot
#ggsave("plots/stl_decomposition_strain_type.png", width = 8, height = 5)

🌱 Forecast the Future: Category Edition

# Optional: Focus on Top 5 Categories
# top_categories <- wa_filtered %>%
#   count(inventory_type_category, wt = retail_sales_amt, sort = TRUE) %>%
#   slice_head(n = 5) %>%
#   pull(inventory_type_category)
# 
# category_ts <- category_ts %>%
#   filter(inventory_type_category %in% top_categories)

# For the full sample dataset
category_ts_filtered <- category_ts %>%
  group_by(inventory_type_category) %>%
  filter(n() >= 2) %>%
  ungroup()

category_model <- category_ts_filtered %>%
  model(
    ets = ETS(retail_sales_amt))

🔮 These sales are highly seasonal. Get ready to stock up every 4/20 and summer solstice.

🌱 Forecast the Next 6 Months for each Inventory Type

# IMPORTANT! Correct the Date column format. Use category_ts_filtered instead of creating a new object. For this example, we create a new object with corrected date column. 

category_forecasts <- forecast(category_model, h = "6 months")

autoplot(category_forecasts, category_ts_filtered) +
  labs(
    title = "Forecasted Cannabis Sales by Category",
    subtitle = "Model: ETS | Horizon: 6 Months",
    x = "Month",
    y = "Retail Sales (USD)"
  ) +
  scale_y_continuous(labels = dollar_format()) +
  facet_wrap(~inventory_type_category, scales = "free_y") +
  theme_minimal(base_size = 10) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    legend.position = "none"
  )

# Optional: Save the plot
#ggsave("plots/fable_forecast_category.png", width = 8, height = 5)

Roll up your own take — now it’s your sesh. Repeat above steps to forecast strain name, product name, or inventory sub-category.

🌿 Bonus Nugget: Passing the TF-IDF n-grams (multi-word strain names).

In the above TF-IDF on strain name facetted plot, I noticed “Dream” and thought where’s “Blue Dream”? For this reason, I checked n-grams = 3.5 (hehe) to get longer strain names for better strain identification and quality output. Let’s see if “Super Silver Haze” shows up in the plot? You can try n = 5 to see how changing ‘n’ matters.

# Load stop words
data("stop_words")

# 🔥 Bonus TF-IDF: Bigrams from strain_name
bigram_tfidf <- filtered_wa %>%
  # Create bigrams from strain_name
  unnest_tokens(bigram, strain_name, token = "ngrams", n = 3.5) %>% # when you pass a decimal, it gets silently coerced to an integer under the hood (e.g., floor(n) or as.integer(n)). It's not an issue to put 3.5, just know n = 3.
  
  # Remove bigrams with numbers or units (like 1g, 1oz)
  filter(!str_detect(bigram, "\\b\\d+(mg|g|oz|lb)?\\b")) %>%
  
  # Remove special characters
  mutate(bigram = str_replace_all(bigram, "[^a-zA-Z\\s]", "")) %>%
  
  # Split and remove stopwords from each part of bigram
  separate(bigram, into = c("word1", "word2"), sep = " ", fill = "right") %>%
  filter(!word1 %in% stop_words$word,
         !word2 %in% stop_words$word) %>%
  
  # Recombine cleaned bigram
  unite(bigram, word1, word2, sep = " ") %>%
  count(licensee_dba, bigram, sort = TRUE) %>%
  bind_tf_idf(bigram, licensee_dba, n)

## Warning: Expected 2 pieces. Additional pieces discarded in 529 rows [1, 2, 3, 4, 5, 6,
## 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].

# Get top 5 bigrams per licensee
top_bigrams <- bigram_tfidf %>%
  group_by(licensee_dba) %>%
  slice_max(tf_idf, n = 5, with_ties = FALSE) %>%
  ungroup()

# Plot results 💥
ggplot(top_bigrams, aes(x = fct_reorder(bigram, tf_idf), y = tf_idf, fill = licensee_dba)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~ licensee_dba, scales = "free_y") +
  coord_flip() +
  labs(
    title = "Top TF-IDF Bigrams from Strain Names",
    subtitle = "Multi-word cannabis strains that stand out per top licensee",
    x = "Strain Bigram", y = "TF-IDF Score"
  ) +
  theme_minimal(base_size = 10)

# Optional save
#ggsave("plots/stl_strain_type_decomposition.png", width = 10, height = 6)

“For the Gram of It: Let’s build a comparative TF-IDF table of top terms by n-gram size for strain_name — specifically for the Top 5 WA cannabis licensees.

# Helper function to compute tf-idf for any n-gram
get_tfidf_by_ngram <- function(data, n) {
  data %>%
    mutate(strain_name = str_to_lower(strain_name)) %>%
    unnest_tokens(ngram, strain_name, token = "ngrams", n = n) %>%
    filter(!str_detect(ngram, "^\\d+$")) %>% # remove pure numbers
    filter(!str_detect(ngram, "\\d")) %>%    # remove mixed digits (like 1g, 3.5g)
    filter(!str_detect(ngram, "\\b(g|mg|oz|lb|gram|grams|ml|pack)\\b")) %>%
    filter(!ngram %in% stop_words$word) %>%
    count(licensee_dba, ngram, sort = TRUE) %>%
    bind_tf_idf(ngram, licensee_dba, n) %>%
    group_by(licensee_dba) %>%
    slice_max(tf_idf, n = 5, with_ties = FALSE) %>%
    ungroup() %>%
    mutate(n_gram = paste0(n, "-gram"))
}

# Generate tf-idf results for 1, 2, 3-grams
unigrams <- get_tfidf_by_ngram(filtered_wa, n = 1)
bigrams  <- get_tfidf_by_ngram(filtered_wa, n = 2)
trigrams <- get_tfidf_by_ngram(filtered_wa, n = 3)

# Combine
ngram_results <- bind_rows(unigrams, bigrams, trigrams)

# Make a comparison table
ngram_table <- ngram_results %>%
  select(licensee_dba, n_gram, ngram, tf_idf) %>%
  arrange(licensee_dba, n_gram, desc(tf_idf)) %>%
  group_by(licensee_dba, n_gram) %>%
  mutate(rank = row_number()) %>%
  pivot_wider(names_from = n_gram, values_from = ngram) %>%
  select(licensee_dba, rank, `1-gram`, `2-gram`, `3-gram`)

# Show the table
print(ngram_table, n = 50)

## # A tibble: 75 × 5
## # Groups:   licensee_dba [5]
##    licensee_dba                  rank `1-gram`         `2-gram`         `3-gram`
##    <chr>                        <int> <chr>            <chr>            <chr>   
##  1 FORBIDDEN FARMS                  1 <NA>             dirty white girl <NA>    
##  2 FORBIDDEN FARMS                  2 <NA>             house blend ind… <NA>    
##  3 FORBIDDEN FARMS                  3 <NA>             pound cake gela… <NA>    
##  4 FORBIDDEN FARMS                  4 <NA>             face off og      <NA>    
##  5 FORBIDDEN FARMS                  1 <NA>             <NA>             mandari…
##  6 FORBIDDEN FARMS                  2 <NA>             <NA>             blueber…
##  7 FORBIDDEN FARMS                  3 <NA>             <NA>             mandari…
##  8 FORBIDDEN FARMS                  4 <NA>             <NA>             orange …
##  9 FORBIDDEN FARMS                  5 <NA>             <NA>             pink ce…
## 10 FORBIDDEN FARMS                  6 <NA>             <NA>             blueber…
## 11 FORBIDDEN FARMS                  1 <NA>             <NA>             <NA>    
## 12 FORBIDDEN FARMS                  2 <NA>             <NA>             <NA>    
## 13 FORBIDDEN FARMS                  3 <NA>             <NA>             <NA>    
## 14 FORBIDDEN FARMS                  1 <NA>             <NA>             <NA>    
## 15 FORBIDDEN FARMS                  1 <NA>             <NA>             <NA>    
## 16 LIFTED CANNABIS                  1 south florida og <NA>             <NA>    
## 17 LIFTED CANNABIS                  1 <NA>             <NA>             <NA>    
## 18 LIFTED CANNABIS                  2 <NA>             <NA>             <NA>    
## 19 LIFTED CANNABIS                  3 <NA>             <NA>             <NA>    
## 20 LIFTED CANNABIS                  1 <NA>             <NA>             <NA>    
## 21 LIFTED CANNABIS                  2 <NA>             <NA>             <NA>    
## 22 LIFTED CANNABIS                  1 <NA>             <NA>             <NA>    
## 23 LIFTED CANNABIS                  1 <NA>             <NA>             <NA>    
## 24 LIFTED CANNABIS                  1 <NA>             <NA>             <NA>    
## 25 LIFTED CANNABIS                  2 <NA>             <NA>             <NA>    
## 26 LIFTED CANNABIS                  1 <NA>             <NA>             <NA>    
## 27 LIFTED CANNABIS                  2 <NA>             <NA>             <NA>    
## 28 LIFTED CANNABIS                  3 <NA>             <NA>             <NA>    
## 29 LIFTED CANNABIS                  4 <NA>             <NA>             <NA>    
## 30 LIFTED CANNABIS                  1 <NA>             <NA>             <NA>    
## 31 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             <NA>             <NA>    
## 32 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             <NA>             <NA>    
## 33 NORTHWEST CANNABIS SOLUTIONS     2 <NA>             <NA>             <NA>    
## 34 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             <NA>             <NA>    
## 35 NORTHWEST CANNABIS SOLUTIONS     2 <NA>             <NA>             <NA>    
## 36 NORTHWEST CANNABIS SOLUTIONS     3 <NA>             <NA>             <NA>    
## 37 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             <NA>             <NA>    
## 38 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             cream cake bx    <NA>    
## 39 NORTHWEST CANNABIS SOLUTIONS     2 <NA>             do si dos        <NA>    
## 40 NORTHWEST CANNABIS SOLUTIONS     3 <NA>             gary peyton ore… <NA>    
## 41 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             <NA>             <NA>    
## 42 NORTHWEST CANNABIS SOLUTIONS     2 <NA>             <NA>             <NA>    
## 43 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             <NA>             <NA>    
## 44 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             <NA>             <NA>    
## 45 NORTHWEST CANNABIS SOLUTIONS     1 <NA>             <NA>             <NA>    
## 46 THE HERBERY                      1 <NA>             <NA>             <NA>    
## 47 THE HERBERY                      2 <NA>             <NA>             <NA>    
## 48 THE HERBERY                      1 <NA>             <NA>             <NA>    
## 49 THE HERBERY                      2 <NA>             <NA>             <NA>    
## 50 THE HERBERY                      3 <NA>             <NA>             <NA>    
## # ℹ 25 more rows

📝 Closing Thoughts

The Washington cannabis market is a hotbox of trends and data-rich opportunities. Whether you’re curious about strain branding, retailer dynamics, or market timing — even this small sample dataset offers plenty of insight for kind-folks to roll up.

Stay lit 🔥 — and don’t forget: legalize data analysis.

Just like you wouldn’t buy a pre-roll without knowing the strain, you shouldn’t forecast retail activity without modeling uncertainty. Enter the Prophet. It doesn’t predict your zodiac vibes, but it does give probabilistic forecasts that adapt to trend shifts and seasonality — perfect for a market as dynamic as weed.

4/20 Encore: Bayesian Time Series Forecast using Prophet

# Identify top 2 selling product categories
top_categories <- wa_canna %>%
  count(inventory_type_category, wt = retail_sales_amt, sort = TRUE) %>%
  slice_head(n = 2) %>%
  pull(inventory_type_category)

# Filter time series for just those categories
category_ts_bayes <- category_ts %>%
  filter(inventory_type_category %in% top_categories)

# Clean for valid time series modeling (at least 2 non-zero entries)
category_ts_bayes_clean <- category_ts_bayes %>%
  group_by(inventory_type_category) %>%
  filter(sum(!is.na(retail_sales_amt) & retail_sales_amt > 0) >= 2) %>%
  ungroup()

# Fit NNETAR model (Bayesian-like neural net autoregression)
category_bayes_model <- category_ts_bayes_clean %>%
  model(
    bayes = NNETAR(retail_sales_amt)
  )

# Forecast next 6 months with confidence intervals
category_forecasts <- forecast(category_bayes_model, h = "6 months", level = c(80, 95))

# Check if confidence levels exist in forecast
if (".level" %in% colnames(category_forecasts)) {
  
  # Convert forecast to wider format for plotting ribbons
  plot_data <- category_forecasts %>%
    filter(.model == "bayes") %>%   # Optional if other models exist
    as_tibble() %>%
    filter(.level %in% c(80, 95)) %>%
    select(month, inventory_type_category, .mean, .lower, .upper, .level) %>%
    pivot_wider(
      names_from = .level,
      values_from = c(.lower, .upper),
      names_glue = "{.value}_{.level}"
    )

  # Bring in historical data
  hist_data <- category_ts_bayes_clean %>%
    select(month, inventory_type_category, retail_sales_amt)

  # Plot forecast with ribbons
  ggplot(plot_data, aes(x = month, y = .mean)) +
    geom_ribbon(aes(ymin = .lower_95, ymax = .upper_95), fill = "#B0E0E6", alpha = 0.3) +
    geom_ribbon(aes(ymin = .lower_80, ymax = .upper_80), fill = "#ADD8E6", alpha = 0.5) +
    geom_line(color = "#1f77b4", size = 1) +
    geom_line(data = hist_data, aes(x = month, y = retail_sales_amt), color = "darkgreen") +
    facet_wrap(~inventory_type_category, scales = "free_y") +
    scale_y_continuous(labels = dollar_format()) +
    labs(
      title = "Bayesian Forecasts of Cannabis Sales (Top 2 Categories)",
      subtitle = "Forecasts with 80% and 95% Confidence Intervals",
      x = "Month", y = "Retail Sales (USD)"
    ) +
    theme_minimal(base_size = 9)
  
} else {
  # Fallback for models that do NOT support CIs (e.g., NNETAR in some settings)
  autoplot(category_forecasts, category_ts_bayes_clean) +
    facet_wrap(~inventory_type_category, scales = "free_y") +
    labs(
      title = "Bayesian Forecasts of Cannabis Sales (Top 2 Categories)",
      subtitle = "Confidence intervals not available from this model",
      x = "Month", y = "Retail Sales (USD)"
    ) +
    scale_y_continuous(labels = dollar_format()) +
    theme_minimal(base_size = 9) +
    theme(legend.position = "none")
}

🌿 Ultimate Cannabis Vibe Recap:

Forecasting the future of Washington weed, are you? We’ve got high hopes and higher confidence intervals. With neural nets peeking six months ahead, it’s like your favorite strain — smooth, adaptive, and slightly unpredictable.😎🔥

Session Information

sessionInfo()

## R version 4.5.0 (2025-04-11 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
## 
## Matrix products: default
##   LAPACK version 3.12.1
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: America/Los_Angeles
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] ggh4x_0.3.0         stopwords_2.3       magrittr_2.0.3     
##  [4] fable.prophet_0.1.0 Rcpp_1.0.14         fable_0.4.1        
##  [7] feasts_0.4.1        fabletools_0.5.0    tsibble_1.1.6      
## [10] scales_1.3.0        tidytext_0.4.2      lubridate_1.9.4    
## [13] forcats_1.0.0       stringr_1.5.1       dplyr_1.1.4        
## [16] purrr_1.0.4         readr_2.1.5         tidyr_1.3.1        
## [19] tibble_3.2.1        ggplot2_3.5.2       tidyverse_2.0.0    
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6         anytime_0.3.11       xfun_0.52           
##  [4] bslib_0.9.0          lattice_0.22-6       tzdb_0.5.0          
##  [7] vctrs_0.6.5          tools_4.5.0          generics_0.1.3      
## [10] curl_6.2.2           parallel_4.5.0       janeaustenr_1.0.0   
## [13] pkgconfig_2.0.3      tokenizers_0.3.0     Matrix_1.7-3        
## [16] distributional_0.5.0 lifecycle_1.0.4      farver_2.1.2        
## [19] compiler_4.5.0       munsell_0.5.1        SnowballC_0.7.1     
## [22] htmltools_0.5.8.1    sass_0.4.9           yaml_2.3.10         
## [25] pillar_1.10.2        crayon_1.5.3         jquerylib_0.1.4     
## [28] ellipsis_0.3.2       cachem_1.1.0         tidyselect_1.2.1    
## [31] digest_0.6.37        stringi_1.8.7        labeling_0.4.3      
## [34] fastmap_1.2.0        grid_4.5.0           colorspace_2.1-1    
## [37] cli_3.6.4            utf8_1.2.4           withr_3.0.2         
## [40] bit64_4.6.0-1        timechange_0.3.0     rmarkdown_2.29      
## [43] nnet_7.3-20          bit_4.6.0            progressr_0.15.1    
## [46] hms_1.1.3            evaluate_1.0.3       knitr_1.50          
## [49] ggdist_3.3.2         rlang_1.1.5          glue_1.8.0          
## [52] rstudioapi_0.17.1    vroom_1.6.5          jsonlite_2.0.0      
## [55] R6_2.6.1

High Hopes: A Data-Driven Deep Dive into Washington’s Cannabis Market (2024–2025)

Yourweeddata.com

4-19-2025 • Got a minute?