Acting as a Teaching Assistant in the Optimizing Apache Spark™ on Databricks Course

Jacek Laskowski
Dec 4, 2022


I joined Databricks as a Lead Instructor under Brooke Wenig.

We had known each other for a long time, so when she asked me one day if I’d be interested in teaching courses at Databricks Academy while keeping my “employment independence”, I didn’t think twice and accepted the offer.

Apache Spark™ Programming with Databricks (2 days)

I was particularly well-prepared for Apache Spark™ Programming with Databricks (2 days) and got 2 classes (private and public) almost instantly. Brooke truly believed I could do wonders! 😏

The two Apache Spark™ Programming with Databricks classes went fine knowledge-wise, yet the scores weren’t as impressive as we had hoped. Also, Optimizing Apache Spark™ on Databricks was coming up next, and since it is taught differently, we decided to give me some exposure before my own class.

Teaching Assistant to Zoltan’s Optimizing Apache Spark™ on Databricks

To improve my teaching, we agreed that I would help other experienced Databricks instructors as a teaching assistant (TA). Zoltan C. Toth was about to give an Optimizing Apache Spark™ on Databricks class, so it was a perfect opportunity to watch him “live on stage” and become a teaching pro.

And he did teach me a lot!

Slack

First and foremost, he showed me how easy it is to create a new Slack workspace with three channels to use for class communication:

  • General
  • Questions
  • Resources

My surprise may surprise you, but for some reason creating a new Slack workspace seemed like way too much for my communication needs, so I never considered it. I was wrong and am going to use Slack whenever possible.

Emoji

Zoltan used emoji a lot to foster communication:

  • For voting 👍/ 👎
  • 👋 to welcome people

I liked it a lot. Slack (and GitHub and perhaps many other collaboration tools) makes using emoji a breeze:

  1. Typing an emoji using the colon
  2. “Supporting” an emoji by simply clicking it

While I was typing “I really wished there was a way to use colon to type emoji at Medium so I could use ❤️ here” the heart emoji showed up! 🔥

Introduction

Another surprise in how Zoltan led the class was how effective introducing 10+ people could be.

Give them 10 minutes to introduce themselves in the #general channel.

Not only does it let the audience do something (so they don’t just have to listen), but it also gives the instructor a break from speaking and a chance to rest a little.

A bonus is to show the students a way to introduce themselves, e.g.:

Hi All!

  • My name is Jacek, I’m Polish living in Warsaw, Poland
  • I’m an IT freelancer at myself (company)
  • I’ve been using Spark for quite some time already
  • My Databricks experience is…
  • I have completed this and that course
  • My programming language is Scala (mainly), with some SQL, Python and Java (in that order)
  • I’d like to learn more about…
  • Fun fact(s): learning French, love stretching (aiming at so-called “Van Damme split”), love calisthenics (aiming at handstand), play basketball and swim often
  • Nice to meet you! 👍

Skitch

I think I’d seen Skitch once or twice but never really used it as a sketching tool. It seems I was wrong, given the tool’s goal:

Get your point across with fewer words using annotation, shapes and sketches, so that your ideas become reality faster.

That’s exactly what you really need for teaching, isn’t it? I’m going to use it more often for sure.

Breaks

It’s very important to keep the teaching pace brisk enough to cover all the modules, yet let people relax early and often so they can “survive” the whole class (say, 2 days).

It is not easy and I was quite bad at it since I’m very talkative and can talk about Apache Spark for long enough to get you tired (even if still excited).

Zoltan’s idea is to split a class into 1-hour blocks with a 12-minute break each.

You may wonder why 12 minutes for a break. As Zoltan explained, it’s longer than 10 minutes, which didn’t always work, yet not too long (like 15 minutes), so the ratio of teaching to resting is well balanced.

Each day started at 9am and ended at 4pm to give students an extra hour to work on the exercises on their own.

Lunch breaks were at 1pm and lasted 1 hour. When there is a timezone difference (think of the UK vs Poland), it becomes 12pm their time. Not bad for a lunch break.
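
The timing described above can be sketched in a few lines of Python. This is only an illustration, not Zoltan’s actual schedule. It assumes the 12-minute break sits at the end of each 1-hour block (48 minutes of teaching, 12 of rest), that lunch starts on a block boundary, and that the UK is exactly one hour behind Poland. The date and the names `build_schedule` and `total_minutes` are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Figures from the text: 1-hour blocks, 12-minute breaks, 9am-4pm, 1-hour lunch at 1pm.
BLOCK = timedelta(hours=1)
BREAK = timedelta(minutes=12)
LUNCH = timedelta(hours=1)

# Assumption: fixed winter offsets (Poland UTC+1, UK UTC+0).
WARSAW = timezone(timedelta(hours=1), "Warsaw")
LONDON = timezone(timedelta(hours=0), "London")


def build_schedule(day_start, day_end, lunch_start):
    """Return (label, start, end) tuples for one class day.

    Assumption: the break closes each 1-hour block and
    lunch starts exactly on a block boundary.
    """
    schedule = []
    cursor = day_start
    while cursor < day_end:
        if cursor == lunch_start:
            schedule.append(("lunch", cursor, cursor + LUNCH))
            cursor += LUNCH
            continue
        block_end = cursor + BLOCK
        schedule.append(("teach", cursor, block_end - BREAK))
        schedule.append(("break", block_end - BREAK, block_end))
        cursor = block_end
    return schedule


def total_minutes(schedule, label):
    """Sum the minutes of all entries with the given label."""
    return sum((end - start).seconds // 60
               for kind, start, end in schedule if kind == label)


day = datetime(2022, 12, 5, tzinfo=WARSAW)  # arbitrary winter date, for illustration
schedule = build_schedule(
    day_start=day.replace(hour=9),
    day_end=day.replace(hour=16),
    lunch_start=day.replace(hour=13),
)

teach_minutes = total_minutes(schedule, "teach")
break_minutes = total_minutes(schedule, "break")
lunch_in_london = day.replace(hour=13).astimezone(LONDON)

for kind, start, end in schedule:
    print(f"{start:%H:%M}-{end:%H:%M}  {kind}")
print(f"teaching:resting = {teach_minutes}:{break_minutes}")
print(f"1pm in Warsaw is {lunch_in_london:%H:%M} in London")
```

Under these assumptions a day has six blocks, which gives 288 minutes of teaching and 72 minutes of breaks (a 4:1 ratio), and the 1pm lunch lands at 12:00 for someone in the UK.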

My Teaching

Zoltan made sure I had the opportunity to hone my teaching skills and let me teach the following modules:

  • Cost-Based Optimization (CBO), which was an extra topic beyond the courseware
  • Bucketing
  • Z-Order (Delta Lake itself is not part of the courseware)

At the end of the class, the students were asked to fill out an evaluation form where, for “Q7: What was your favorite part of the class?”, they said:

almost everything but Jacek Laskowski’s explanations were really good.

I’m sure you know how I felt having seen it.

Thank you Zoltan and Brooke! 👏👏👏

Tomorrow, Monday, Dec 5th, is going to be my exam day. It’s my turn to teach an Optimizing Apache Spark™ on Databricks course, and I’ll see how many of the above tricks I manage to remember and use in practice. Wish me luck! 🍀
