GPT-4 data analysis

Illustration: GPT-4

Generative AI is being developed to handle data analysis and statistics. Here is an example.

We have a dataset of questionnaire responses from just over 2,000 young people about (aspects of) their school experience, in CSV (comma-separated values) format.

We upload this file to a ChatGPT app and ask for an analysis. So this is not a matter of plagiarism, but of “original” use of AI.

(Spoiler alert: It helps if the teacher cares.)


The dataset [filename] contains 2271 entries and 39 columns related to various aspects of students’ experiences and feelings about school, personal issues, and mental health. The columns include:

  • Kjønn (Gender), Klasse (Class): Demographic information.
  • Jeg trives på skolen (I enjoy school), Lærerne mine bryr seg om meg (My teachers care about me), etc.: Students’ perceptions and feelings towards school.
  • Familieøkonomi (Family economy), Venner (Friends), Mobbing (Bullying): Personal and social aspects of students’ lives.
  • Several columns concerning students’ psychological well-being and the support they seek for personal problems.

Given this rich dataset, we can explore several analyses. Here are three potential areas of analysis:

  1. School Environment and Student Well-being: We can analyze how different aspects of the school environment (like enjoyment, teacher support, and feelings of belonging) correlate with students’ overall well-being and mental health scores.
  2. Impact of Personal and Social Factors on Academic Performance: Exploring how factors like family economic status, bullying, and social support (friends, family) relate to students’ academic performance and engagement (e.g., feelings about school, time spent on homework).
  3. Predictive Analysis of Mental Health: Using machine learning techniques, we could develop a model to predict students’ mental health based on their school experiences and personal circumstances.

Please let me know which analysis you find most interesting, or if there is another specific aspect you’d like to focus on!

Please do no. 2.

For the second analysis on Impact of Personal and Social Factors on Academic Performance, we’ll explore how factors like family economic status, experiences of bullying, and social support relate to students’ feelings about school and their engagement with academic work. This will involve examining correlations and potentially fitting a regression model to predict academic engagement based on these variables.

Let’s start with an exploratory analysis to understand the distribution of these factors and their initial relationships. I’ll also check for any missing values and prepare the data accordingly. Here’s what we’ll look at:

  1. Descriptive statistics and distributions for key variables: Family economy, Friends, Bullying, Enjoyment at school, and Teacher support.
  2. Correlation analysis to see how these variables are related to each other and to students’ feelings about school.

Let’s dive into the exploratory analysis.

Exploratory Analysis Summary

Here are the key insights from the initial exploratory analysis of selected variables related to personal and social factors, and their influence on academic performance:

Missing Data:

  • Familieøkonomi (Family economy): 64 missing values
  • Venner (Friends): 53 missing values
  • Mobbing (Bullying): 56 missing values
  • Jeg trives på skolen (I enjoy school): 299 missing values
  • Lærerne mine bryr seg om meg (My teachers care about me): 311 missing values
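
The chat output does not show the code behind these counts. As a minimal sketch of what such a check might look like, assuming that a blank cell marks a missing answer, the following uses only the Python standard library and a made-up four-row CSV in place of the real survey file. The helper name count_missing and the sample values are illustrative assumptions, not taken from the dataset.

```python
import csv
import io

# Made-up stand-in for the survey file; blank cells represent missing answers.
SAMPLE_CSV = """Familieøkonomi,Venner,Mobbing
2,1,6
,1,6
3,,5
2,2,
"""

def count_missing(csv_text):
    """Return {column name: number of blank cells} for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            value = row[name]
            # DictReader yields None for cells missing from a short row.
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts

missing = count_missing(SAMPLE_CSV)
print(missing)
```

On the real file one would pass the file's contents instead of SAMPLE_CSV; a data-analysis library would typically do the same count in a single call.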

Descriptive Statistics:

  • Family Economy ranges from 1 (good) to 5 (poor), with a median of 2, suggesting most students report an average economic status.
  • Friends shows a median of 1, indicating most students feel they have good friendships.
  • Bullying mostly rates at 6 (presumably representing minimal or no bullying).
  • Enjoyment at School and Teacher Support both show lower means (closer to 1), suggesting positive experiences overall but with some variability indicating not all students feel well-supported or enjoy school.
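
The summary figures above (range, median, mean) are simple to reproduce by hand. A small sketch with made-up answers on a 1 (good) to 5 (poor) scale, where None stands for a missing answer; the values are invented for illustration and are not the survey data.

```python
import statistics

# Made-up answers on a 1 (good) to 5 (poor) scale; None marks a missing answer.
family_economy = [2, 1, None, 3, 2, 5, 2, None, 1, 2]

# Descriptive statistics are computed on the answered items only.
answered = [v for v in family_economy if v is not None]
summary = {
    "n": len(answered),
    "min": min(answered),
    "max": max(answered),
    "median": statistics.median(answered),
    "mean": statistics.mean(answered),
}
print(summary)
```

For ordinal items like these, the median is usually the safer summary, since the mean treats the distance between neighbouring answer categories as equal.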

Next Steps:

  1. Handling Missing Data: Given the missing values, particularly in the school experience variables, we should decide whether to impute these missing values or drop them for the analysis. Imputation might be considered if we want to maintain sample size and can logically estimate the missing values.
  2. Correlation Analysis: We’ll examine the relationships between these factors and how they relate to students’ enjoyment and perceived support at school.
  3. Predictive Modeling: Depending on the results of the correlation analysis, we may build a regression model to predict student engagement or feelings about school based on their personal and social circumstances.

Let’s start by handling the missing data. We will use median imputation for numerical variables because it’s robust to outliers. This approach should work given the ordinal nature of most of our variables. Let’s apply this and proceed to the correlation analysis.
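
Median imputation, as described here, replaces each missing answer with the median of the answers that were given. The chat output does not show its code; the following is a minimal sketch in plain Python, with a hypothetical helper impute_median and made-up values.

```python
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

# Made-up answers; None marks a missing answer.
enjoy_school = [1, 2, None, 1, 3, None, 2]
imputed = impute_median(enjoy_school)
print(imputed)
```

One design consequence is worth keeping in mind: every imputed value is identical, so imputation reduces the spread of a variable. With roughly 300 of 2271 answers missing on the school-experience items, this can affect the correlations computed afterwards.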

Correlation Analysis Summary

The missing values have been successfully imputed, and here are the correlations observed among the selected variables:

  1. Familieøkonomi (Family economy) shows a positive correlation with students enjoying school (0.22) and receiving care from teachers (0.15), indicating that a better family economic situation might be linked to more positive school experiences.
  2. Venner (Friends) has a positive correlation with enjoyment at school (0.17) but a lower correlation with teacher care (0.09). This suggests that having good friendships is associated with enjoying school.
  3. Mobbing (Bullying) has a negative correlation with both enjoyment at school (-0.33) and care from teachers (-0.16), indicating that experiences of bullying are significantly associated with negative perceptions of school.
  4. The strongest correlation is between Jeg trives på skolen (I enjoy school) and Lærerne mine bryr seg om meg (My teachers care about me) (0.39), suggesting that students who feel cared for by their teachers are much more likely to enjoy school.
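
The chat output does not say which correlation coefficient was used. The sketch below assumes the Pearson coefficient and computes it from its definition in plain Python; for ordinal survey items, Spearman's rank correlation is a common alternative. The function name pearson and the answer values are made up for illustration, with lower numbers meaning stronger agreement.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up answers; lower numbers mean stronger agreement with the statement.
teachers_care = [1, 1, 2, 2, 3, 4, 1, 2]
enjoy_school = [1, 2, 2, 1, 3, 3, 1, 2]
r = pearson(teachers_care, enjoy_school)
print(round(r, 2))
```

A positive value means the two sets of answers tend to move in the same direction. It says nothing about which one, if either, causes the other.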

Given these insights, the next step is to build a predictive model to understand how these factors might predict students’ enjoyment of school. We’ll use a linear regression model for this purpose, considering ‘Jeg trives på skolen’ as the dependent variable and the other factors as independent variables.

Let’s proceed with the model building.

Model Evaluation and Results

The linear regression model aimed to predict students’ enjoyment at school (‘Jeg trives på skolen’) based on their family economic status, quality of friendships, experiences of bullying, and perceived teacher support. Here are the results:

  • Mean Squared Error (MSE): 0.30
  • R² (Coefficient of Determination): 0.29

The R² value of 0.29 suggests that about 29% of the variability in students’ enjoyment at school can be explained by the model, which is moderate. This indicates that while the factors included do impact enjoyment, other unexamined factors might also play significant roles.

Model Coefficients:

  • Familieøkonomi (Family economy): 0.087
  • Venner (Friends): 0.068
  • Mobbing (Bullying): -0.155
  • Lærerne mine bryr seg om meg (My teachers care about me): 0.287

These coefficients provide insights into how each factor influences students’ enjoyment:

  • Positive coefficients for Family economy and Friends suggest that better economic conditions and having good friendships are associated with increased enjoyment at school.
  • A negative coefficient for Bullying confirms that experiences of bullying decrease enjoyment at school.
  • The strongest positive influence is from Teacher support, showing that care from teachers substantially increases students’ enjoyment.
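
The fitting step itself is not shown in the chat output. As a rough sketch of what a linear regression with MSE and R² involves, the following fits ordinary least squares via the normal equations in plain Python. The helper names (solve, fit_ols, predict, mse_and_r2) and all data values are assumptions for illustration; in practice a library such as scikit-learn or statsmodels would normally be used. The sketch evaluates the model on the same rows it was fitted on; the chat output does not say whether a separate test set was held out.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... by least squares; returns [b0, b1, ...]."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, rows):
    """Apply fitted coefficients (intercept first) to each row of predictors."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]

def mse_and_r2(y, y_hat):
    """Return (mean squared error, coefficient of determination)."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return ss_res / n, 1 - ss_res / ss_tot

# Made-up rows: (teachers care, bullying) as predictors, enjoyment as outcome.
rows = [(1, 6), (1, 6), (2, 5), (2, 6), (3, 4), (4, 3), (1, 5), (2, 6)]
enjoy_school = [1, 2, 2, 1, 3, 3, 1, 2]

coefs = fit_ols(rows, enjoy_school)
mse, r2 = mse_and_r2(enjoy_school, predict(coefs, rows))
print(coefs, mse, r2)
```

Each coefficient is the expected change in the outcome for a one-step change in that predictor with the others held fixed. Like the correlations, it describes an association in the data and does not by itself establish cause and effect.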

Conclusion and Recommendations

The analysis indicates significant relationships between personal/social factors and students’ school enjoyment. Educational institutions might focus on enhancing teacher-student relationships and creating supportive environments to reduce bullying, thereby improving students’ school experiences.

This model can be refined with additional data or variables (such as extracurricular participation, parental involvement, etc.) to better predict and understand other factors influencing school enjoyment.

If you have any further questions or need additional analysis on different aspects of the dataset, feel free to ask!


At this point we could ask the AI to explain the individual steps in detail, define technical terms and formulas, pose practice questions and evaluate the answers, create multiple-choice tests, and so on.

Don't teaching formats now need to be reformed? There is a need for experimental development work.