Discussion Forum

This is a discussion forum where you can ask questions and chat casually about everything. You can post pictures here.

3 comments

  1. Anonymous
    Hello everyone, I need your help with some code that I still cannot get working after many attempts.
    I want to write a custom function, numeric_stats, that takes a data frame and returns the minimum, maximum, mean, and median for all numeric columns. I then need to use this function and sapply to create a data frame, numeric_diamonds_stats, for the groups defined by 'cut' and 'color' in the diamonds dataset.
    The result has to follow the specific structure I attached,
    and I have to use the functions group_by and summarise each.
    Thank you! 🙂
    1. Anonymous
      # Load necessary libraries
      library(dplyr)

      # Define the function to be applied to each column
      your_function <- function(x) {
        # Return the average if the column contains numeric data, otherwise return NA
        if (is.numeric(x)) {
          return(mean(x, na.rm = TRUE))
        } else {
          return(NA)
        }
      }

      # Define the dummy dataset
      employee_data <- data.frame(
        employee_id = 1:100,
        department = sample(c("HR", "Finance", "IT", "Marketing"), 100, replace = TRUE),
        age = sample(22:60, 100, replace = TRUE),
        salary = runif(100, min = 30000, max = 100000)
      )

      # Grouping variables
      grouping_vars <- c("department")

      # Columns to exclude
      columns_to_exclude <- c("employee_id")

      # Use the dummy dataset
      employee_data %>%
        group_by(across(all_of(grouping_vars))) %>%
        summarize(
          across(
            .cols = -all_of(columns_to_exclude), # Exclude specified columns
            .fns = list(your_function),
            .names = "avg_{.col}"
          ),
          .groups = "drop"
        )

  2. Anonymous
    I found this helpful