Python

simple webscraper for last.fm with BeautifulSoup

Written by  on September 29, 2020

tl;dr:
Simple webscraper with Python and BeautifulSoup for one user’s favorite tracks (‘loved songs’) at last.fm.
Repository: github

full text:
Looks like last.fm is shutting down its services (one feature at a time, lol). They started this process more or less ten years ago.
I’ve realized that I would miss my curated list of favorite tracks and I am also very bad at remembering, so … let’s automate the process of grabbing that information from their public page. I know, they offer a REST API, but I wanted to try BS4 for once.
Since I somehow had two free hours fourteen days ago, I rushed through some tutorials and played with the GET requests and the parsing. I then spent the last minutes turning the received “artist+track” string into something usable. Altogether four hours were spent and I am amazed by the result. Of course, by leveraging three quite powerful libraries (beautifulsoup4, requests, lxml) and skipping TDD (;) I reached the goal quite fast. And since the script works (my 1500 loved songs are scraped in less than 60 seconds), I will not spend additional effort to make it “pretty”.
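
The last step, turning the combined “artist+track” string into something usable, could look roughly like the sketch below. This is not the script from the repository: the " - " separator and the names `LovedTrack` and `split_artist_track` are assumptions made purely for illustration, and the real page markup may need a different split.

```python
from typing import NamedTuple


class LovedTrack(NamedTuple):
    """Hypothetical container for one scraped entry."""
    artist: str
    title: str


def split_artist_track(raw: str, separator: str = " - ") -> LovedTrack:
    """Split a combined 'artist<separator>title' string into its two parts.

    Assumption: the separator sits between artist and title. Only the first
    occurrence is used, so a title may contain the separator itself.
    """
    artist, found, title = raw.strip().partition(separator)
    if not found:
        raise ValueError(f"separator {separator!r} not found in {raw!r}")
    return LovedTrack(artist.strip(), title.strip())
```

Splitting on the first separator only is a deliberate choice in this sketch: it keeps titles that contain a dash intact, at the cost of mis-splitting artist names that contain one.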

“[Full Day Workshop] Kubeflow + BERT + GPU + TensorFlow + Keras + SageMaker”

Written by  on September 27, 2020

I’ve just spent the last eight hours attending a workshop about #SageMaker, #AutoPilot, #BERT, #Athena, #TensorFlow, #Spark, [..] and I am feeling a bit light-headed.
Of course, the talk and guidance given by @AntjeBarth and @ChrisFregly were really well prepared, but if you’re just an ML beginner (like me) and over 9000 new technologies drop on you, you have to work hard to follow the fast-paced event.
Of course, I started my ML-journey in the summer of 2019, but it was more focussed on image-processing, not #NLP. I worked before with #Python, #Jupyter notebooks, TensorFlow and #Keras, but that whole SageMaker-thing was new to me.
And I see the potential: instead of running the stuff locally, you prepare, prototype and run your ML-app inside Amazon’s infrastructure. And that AutoPilot, which helps to quickstart the prototyping by trying several preprocessing-steps and models for you on your data, looks promising. Will definitely give it a second look.
Notes can be found at: https://github.com/marcelpetrick/KubeFlow_BERT_GPU_TensorFlow_Keras_SageMaker_Workshop (needs lots of polishing, as always)

Crazy times we live in! And I am thankful for this block of time on a weekend 🙏

PyQt: GroundSpace

Written by  on September 4, 2020

Over the past weeks I’ve worked on a small project to combine the best of the Qt and Python domains. I have known about the PyQt (Riverbank) and PySide (Qt) bindings for years, but never really dipped my toes into those waters. It was time to fix this.

GroundSpace (wordplay) is a small tool to fill your hard-disk (SSD ..) with arbitrary content. To test the speed of writing and to create big chonks of data.

What was learnt?
* creating an ui-file with QtDesigner (jk, I knew this) and how to pre-compile it for PyQt-usage
* loading that uic-file and creating connections
* progress-callback
* how evil the ‘eval()’ function in Python is
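
To illustrate the last point: the post does not say how `eval()` was used in GroundSpace, so the following is only a sketch under the assumption that some user-supplied text (for example a size) has to be turned into a number. `ast.literal_eval` from the standard library accepts Python literals only and refuses to run arbitrary code. The function name `parse_number` is made up for this example.

```python
import ast
from typing import Union


def parse_number(text: str) -> Union[int, float]:
    """Parse user input as a plain number without executing it.

    ast.literal_eval only evaluates literals; input containing names or
    function calls raises an exception instead of being run.
    """
    value = ast.literal_eval(text.strip())
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"not a number: {text!r}")
    return value
```

With `eval()` the same input field would happily execute something like `__import__('os').getcwd()`; here that string is rejected with a `ValueError`.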

Next stop: I want a proper web-scraper in Python.

GitShortlogToPieChart (Python: Git ➔ matplotlib)

Written by  on August 17, 2020

Time to turn the spotlight on a tiny project I finished five weeks ago. The plan was to create a script which retrieves the number of commits per committer for the current repository (without those distorting merge-commits), creates a pie chart from it and saves the plot as raster graphics.
Implementation was more or less straightforward, but again – I learned a lot. Talking is one thing, creating some usable proof-of-concept is the other. And words are cheap. No matter how trifling the task may seem, action speaks louder than “ah, shouldn’t be a problem”.
Actually I had done this before the Python-graphics-workshop, because even then I thought matplotlib is quite a mighty tool which will come in handy.
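
The idea described above can be sketched as follows. This is not the code from the GitShortlogToPieChart repository; the function names and the output file name are assumptions for illustration. It relies on `git shortlog -s -n --no-merges HEAD`, which prints one "count, tab, author" line per committer, and on matplotlib being installed for the plotting part.

```python
import subprocess
from typing import Dict


def parse_shortlog(output: str) -> Dict[str, int]:
    """Turn 'git shortlog -s' output into {author: commit_count}."""
    counts: Dict[str, int] = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line looks like "<count><whitespace><author name>".
        number, name = line.split(None, 1)
        counts[name.strip()] = int(number)
    return counts


def commits_per_author(repo_path: str = ".") -> Dict[str, int]:
    """Ask git for the per-author commit counts, excluding merge commits."""
    result = subprocess.run(
        ["git", "shortlog", "-s", "-n", "--no-merges", "HEAD"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return parse_shortlog(result.stdout)


def save_pie_chart(counts: Dict[str, int], target: str = "commits.png") -> None:
    """Plot the counts as a pie chart and save it as a raster image."""
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.pie(list(counts.values()), labels=list(counts.keys()), autopct="%1.0f%%")
    ax.set_title("Commits per author (merges excluded)")
    fig.savefig(target)
    plt.close(fig)
```

Keeping the parsing separate from the git call and the plotting makes the text handling easy to test without a repository or a display.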

Project can be found here: GitShortlogToPieChart

Call like this:

workshop: Graphics with Python

Written by  on August 11, 2020

Initial plan was to visit a course at the VHS (MVHS: Münchner Volkshochschule) about ‘NLP with Python & DeepLearning’ (natural language processing). But the tutor quit, so I checked what else I could learn! Notes and examples are archived here: graphicsWithPython
The course took place on two evenings. The lecturer was Dr. Günter Spanner. We coasted through examples with matplotlib, tkinter and pygame.


Of course, tkinter is available out of the box with newer Python distributions. But the resulting GUI is butt-ugly (I feel like using those UNIX-workstations in the first semester of computer science..) and you do not have much influence on the layout. Since I have been working with PyQt in the background for a while now (will be covered in one of the upcoming posts), I can say: good that I had a hands-on, but I will NOT use that.


matplotlib: high value in quick generation of plots of all kinds (bars, line-charts, pie-charts, ..). I’ve used it before and I guess this is the main takeaway from this learning-opportunity.
https://raw.githubusercontent.com/marcelpetrick/graphicsWithPython/master/exercise_breakingDistance.png


pygame: loading some graphics, adding a game-loop, reacting to user-input, all fine. But it would require some additional effort for understanding. Maybe in the future.


Conclusion:
Of course, a two-day workshop can’t provide you with credible knowledge and expertise for three frameworks. But having a teacher can ease the starting-pain and allows quick feedback in case something does not work. For me it was also a good opportunity to have some exchange with people and some learning-atmosphere. Also: since tkinter is so butt-ugly, I got further momentum continuing my PyQt-project.

MicroPython (uPython) – the future is here

Written by  on November 27, 2019

My activity with the ESP8266/ESP32 boards had somehow fallen asleep after I set up one of the ESP8266 as Wifi-repeater. It worked, but creating my own devices was too cumbersome. Firing up the ArduinoStudio took ages, and building and downloading the firmware in C++ was error-prone and slow as well (seriously, this is a tiny program, what the hell happens in the background?).
But I knew there exists a path, which could save some time: µPython (micropython). It runs a firmware, you just deploy your “code”. In my case now a tiny hello-world-like program.

I used this and that tutorial and the uPyCraft-IDE. Got it working with an ESP8266 in minutes.

Seriously: goodbye crappy, non-structured and slow-to-build C/C++ for my microcontrollers. The future is here o/

By the way: the motivation also comes from my current daily practice of Python (of course, I still contribute to C++/Qt-based projects), but my current flame is Python (for Project Euler and daily coding challenges).

python: maximum size of certain containers

Written by  on July 9, 2019

Getting the 64 bit version is quite important. I still don’t get why the 32 bit one was preferred for Win ..

32 bit: 2147483647 (elements)
64 bit: 9223372036854775807 (elements)

20190711 edit: even if the container could hold that many elements, remember that a [Boolean] takes 24 bytes (instead of one bit) in vanilla Python. In other words: if you run out of real memory, you get a MemoryError :/
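
A quick way to check these limits on your own interpreter is sketched below. The post does not say how the numbers were obtained; using `sys.maxsize` and `sys.getsizeof` is my assumption, and the helper name `describe_limits` is made up. The reported object size can differ between Python versions, so no expected value is given for it.

```python
import sys


def describe_limits() -> dict:
    """Collect the container size limit and the size of a bool object."""
    return {
        # Largest value a container index/length can take on this build.
        "max_container_size": sys.maxsize,
        # True on a 64 bit build, False on a 32 bit one.
        "is_64bit": sys.maxsize > 2**32,
        # Size in bytes of the bool object itself (version dependent).
        "bool_object_bytes": sys.getsizeof(True),
    }


if __name__ == "__main__":
    for key, value in describe_limits().items():
        print(f"{key}: {value}")
```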

Education 2019

Written by  on June 27, 2019

Time to reveal the plans for supervised education in 2019: today my course at the “Volkshochschule” (adult evening school?) on “machine learning with Python” starts (you remember the advanced Python course I did in May 2018, which I backed up by doing exercises for Project Euler and the “365 days coding challenge”? link ). I am really, really looking forward to it and am eager to learn 🙂

After this is finished in June my course for basic introduction to Mandarin will start (not a ‘computer’ language, but an interesting one) in addition to my daily Duolingo-exercises for this upcoming lingua franca (from my POV). Will be a challenge, especially the spoken version with its intonations.

Last but not least: one week ago I participated in a course as first responder for emergencies (at work).

And I still plan to do a professional certified course for software architecture (the big picture) in autumn.

edit: a repo with a part of the notes from the machine learning with Python-course is on GitHub.

advanced whitespace-correction

Written by  on February 26, 2019

(for CMake/C++-projects)

Find all fitting files and run the fixer-script in parallel over it.

After playing for a while with sed and awk and not being able to get a fitting solution, I decided to create my own as a Python script. It squashes all consecutive whitespace-only lines into one (and adds one to the end if missing).
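
The squashing step could be sketched like this. It is not the linked squashMultipleWhitespace.py, only a minimal illustration; the function name is made up, and I am assuming that "adds one to the end" refers to a final newline.

```python
def squash_blank_lines(text: str) -> str:
    """Collapse runs of whitespace-only lines into a single empty line.

    The result always ends with exactly one newline (assumption: this is
    what 'adds one to the end if missing' means).
    """
    result = []
    previous_blank = False
    for line in text.splitlines():
        is_blank = not line.strip()
        if is_blank and previous_blank:
            continue  # drop every blank line after the first in a run
        # Blank lines are emitted as truly empty; other lines stay untouched.
        result.append("" if is_blank else line)
        previous_blank = is_blank
    return "\n".join(result) + "\n"
```

Reading a file, passing its content through the function and writing it back is then enough to process one file per invocation, which is what makes it easy to fan out over many files in parallel.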

Or “run it on the files changed for the last commit”:

Sources:
removeTrainling.sh
squashMultipleWhitespace.py

How to get GNU parallel (developed by Ole Tange):

Retrospective view at 2018

Written by  on February 1, 2019

The first month of 2019 already passed. And we passed it with flying colors!
But let’s have a look at 2018 – a year full of challenges and success: I’ve worked full-time, organized and participated in advanced courses for Python and in Requirements Engineering (officially: IREB Requirements Engineering Foundation Level-approved) and pursued a new employment as software engineer.

And I wrote some software in my spare time, as you can see in the graph for the public GitHub repositories. The gaps in the commits can be explained by the birth of my daughter and the time when I acquired the new job and moved nearly 900 km across the country. Yay! Nice personal projects were and are Cullendula and the Daily Coding Challenges, which I solve mostly with fully Unit-tested Python (3).

More new, hands-on knowledge was gained in the area of CMake and Qt-charts.
Well – 2018 was great. Let me make 2019 greater! 💪