00:00:14.960
Hi, my name is Andrew Turley. I'm a junior lead software engineer at The Ladders. Today, I'm going to talk to you a bit about COBOL.
00:00:20.400
So, first of all, who here knows COBOL? All right, so this should be pretty easy to put one past you guys. I'd like to talk a little bit about what COBOL is and get into a little bit of the history of the language itself.
00:00:29.920
Back in 1959, the landscape in computer programming was that you bought a computer from someone like IBM or Honeywell and you got the programming language with it. There really wasn't a cross-platform language for writing code, which meant you were tied to whatever piece of hardware you bought. This made it difficult for customers, as they often faced the issue of not liking the language of the machine they purchased. In 1959, a woman named Mary Hawes was at a conference and started cornering people, saying there was a need on the business side for some sort of common programming language that could be used to write business applications.
00:01:04.000
She began assembling a team of people interested in creating this language. By the middle of 1959, they had started an organization to develop COBOL. They established a long-range committee responsible for strategic long-term planning, a medium-range committee for planning the next few years, and a short-term committee tasked with designing the language itself. The short-term committee was the only one that actually ended up doing anything. In December of 1959, they produced the first COBOL specification.
00:01:38.640
According to Jean Sammet, who was on the short-term committee, the driving ideas behind COBOL were to create a language that was natural and could be read like English, making it accessible to business users. They wanted ease of transcription to the medium, which was critical at a time when code was often entered on teletypes using five-bit character sets. Some academics proposed languages that called for characters that were practically impossible to type on contemporary machines. The goal was to ensure that COBOL's language specification matched what users could actually type and print out.
00:02:09.840
The third requirement was for a structure that allowed problem specification within the language, and finally, they aimed for implementability. The initial idea was that the first word of every sentence would be a verb so that you would effectively be programming in an imperative style with a limited number of verbs that had many options. The 'GO TO' command would be permitted after every statement, allowing easy jumps within the code. This was before Dijkstra's famous 'Go To Statement Considered Harmful,' and back then, many believed the 'GO TO' command was a good idea.
00:02:59.680
It was also thought that new verbs could be added at any time, so the language was initially expected to be extensible. However, by 1965, this feature was removed from the specification because no one had actually implemented it. A quick word about functions: at that time, there was significant academic thought regarding computers, with many complex papers published that confused the business community. The prevailing belief was that functions were too complicated for non-mathematicians to grasp.
00:03:25.440
So rather than integrate functions, COBOL developed its own system of verbs that could be chained together, making the language easier to understand for non-programmers. This reliance on verbs continued until 1989, when built-in functions were added; user-defined functions finally appeared in the specification in 2002. Before that, manufacturers had created their own functions, which were often ad hoc.
00:04:02.560
Now, I want to discuss some of the things that we gained from COBOL, both good and bad. One of the first positive aspects was the enforced separation of concerns, which can be quite beneficial. COBOL programs are divided into four divisions: the identification division, the environment division, the data division, and the procedure division. If you look at most modern programming languages and environments, you'll find that few enforce such separation.
00:04:34.560
Unlike COBOL, which mandated this structure, many programming environments leave you to handle identification and configuration details on your own. There is also the inclusion of an identifiable section for declaring data, which is crucial for readability and maintainability in programming.
00:05:08.960
I've got an example from Storm, which is an event processing system. Storm has a Clojure DSL for doing the same kind of work COBOL does: processing data. A definition of a bolt in Clojure bears a significant resemblance to COBOL's structure, with clear identifiers and a logical layout for output and input processing.
00:05:48.400
Systems we use today offer various levels of this organization and clarity, but COBOL was the first significant language to encourage this kind of structured thinking in programming languages and systems.
00:06:07.760
Another point to consider is that naturalness in language design is subjective and not always a reliable measure of a language's quality. Let's explore some real COBOL code to illustrate this point. When you see a COBOL program, hopefully, you grasp what is happening—it reads simple records from a file and copies them to another file for printing. If you have even a loose understanding of its function, you can identify the program's parts quickly. In fact, I'd argue that, even if you weren't a programmer, you could read COBOL code and deduce its purpose.
00:07:02.560
However, the simplicity can lead to verbosity, particularly when dealing with more abstract concepts. As those who've worked with COBOL can attest, the insistence on using natural language can sometimes hinder more advanced programming tasks, making it difficult to work with mathematical operations or string manipulations due to the resulting verbose syntax. Part of the reason there are so many lines of COBOL in existence is that the language makes it hard to write concise code.
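To make the verbosity point concrete, here is a minimal Python sketch. It is not from the talk's slides. The payroll values and the helper names `multiply_giving` and `add_to` are hypothetical stand-ins for COBOL-style verbs, and the COBOL statements appear only in comments for comparison.

```python
from decimal import Decimal

# Hypothetical payroll values, chosen only for illustration.
hours = Decimal("40")
rate = Decimal("12.50")
bonus = Decimal("25.00")

# Expression style: one line, using operators.
gross_pay = hours * rate + bonus

# Verb style, loosely mimicking COBOL statements such as
#   MULTIPLY HOURS BY RATE GIVING GROSS-PAY.
#   ADD BONUS TO GROSS-PAY.
# Each helper below is an assumed stand-in for one verb.
def multiply_giving(a, b):
    """Return the product, like MULTIPLY a BY b GIVING result."""
    return a * b

def add_to(amount, target):
    """Return target increased by amount, like ADD amount TO target."""
    return target + amount

gross_pay_verbs = multiply_giving(hours, rate)
gross_pay_verbs = add_to(bonus, gross_pay_verbs)
```

Both styles compute the same value. The verb style simply needs one statement per operation, which is where the extra lines come from as the arithmetic grows.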
00:08:14.960
When discussing programming languages today, you might hear someone express admiration for Ruby because it's intuitive or natural. What they typically mean is that Ruby's syntax resembles something familiar and accessible. The same was true in COBOL's day, when the familiar reference point was English. Now, most programmers are accustomed to languages that incorporate functions, and later languages tend to strike a balance between accessibility and complexity.
00:08:49.920
Finally, a critical takeaway from COBOL is the reminder that ETL (extract, transform, and load) will always be relevant. This process essentially defines the interactions we have with databases and mirrors what COBOL was designed to achieve. COBOL should not be seen as a general-purpose programming language; instead, it is better viewed as a domain-specific language (DSL) and a framework for ETL tasks, describing how data is pulled from one location, manipulated, and then written to another.
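The extract, transform, and load shape described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the talk: the fixed-width record layout, the doubling rule in `transform`, and all function names are hypothetical, and in-memory text streams stand in for the input and output files.

```python
import io

# Assumed fixed-width layout, in the spirit of a COBOL record description:
# the first 10 columns hold a name, the next 5 hold a zero-padded quantity.
NAME_WIDTH = 10
QTY_WIDTH = 5

def extract(source):
    """Yield (name, quantity) pairs parsed from fixed-width text lines."""
    for line in source:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        name = line[:NAME_WIDTH].strip()
        qty = int(line[NAME_WIDTH:NAME_WIDTH + QTY_WIDTH])
        yield name, qty

def transform(records):
    """Upper-case the name and double the quantity (an arbitrary rule)."""
    for name, qty in records:
        yield name.upper(), qty * 2

def load(records, sink):
    """Write each record to the output as a comma-separated line."""
    for name, qty in records:
        sink.write(f"{name},{qty}\n")

def run_etl(source, sink):
    """Chain the three stages: read, manipulate, write."""
    load(transform(extract(source)), sink)

# Build two sample input records that match the assumed layout.
sample = f"{'widget':<10}{12:05d}\n{'gadget':<10}{3:05d}\n"
source = io.StringIO(sample)
sink = io.StringIO()
run_etl(source, sink)
result = sink.getvalue()
```

Each stage is a generator, so records flow through one at a time, much like a record-at-a-time read, process, and write loop.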
00:09:13.760
I have a visual from a COBOL book that illustrates the ETL process beside a diagram from Apache Pig documentation, a framework built on Hadoop. These visuals highlight that the tasks we perform with modern tools aren't much different from those established by COBOL. The same principles apply today as we continue to seek languages that express these ideas effectively across big data and event processing challenges.
00:09:48.000
In conclusion, you can find the references I used for this discussion at a shortened URL: bit.ly/gorucco-cobol, where you can access documents and articles related to COBOL. Unfortunately, many COBOL resources are not freely available online, but this URL links to some ACM articles and books that can help you further explore COBOL's history and relevance. Understanding our past is crucial, and I encourage everyone to delve into it.
00:10:16.800
Thank you, and go learn about COBOL!