00:00:03.600
Thank you very much for having me here, and thanks for staying. You had the option of two rooms, and you decided to follow me for the next 45 minutes. I hope you won't regret it.
00:00:10.000
Today, I'd like to talk about service-oriented architecture. But before we dive deep into the topic, let me introduce myself.
00:00:16.960
My name is Ole Michaelis, I'm from Hamburg, Germany, and I'm 26 years old. I consider myself a web nerd, user groupie, and sometimes a conference speaker. If you feel I'm talking too much, just tweet me at Codestars.
00:00:27.439
You can find my blog at post.u, and you might know me because I organized a conference called "Slaughterio" in Hamburg.
00:00:36.000
Currently, I work with Jimdo. Jimdo is a website builder aimed at helping everyone create their own websites without requiring HTML, CSS, or JavaScript knowledge.
00:00:45.039
I am part of the template team and have been working with Jimdo for about a year and a half. I started on January 1st last year and have switched teams quite a bit. Now, as mentioned, I'm part of the template team where my job title is somewhat humorously called 'shipment'.
00:00:56.960
In my team, we are the first in my company implementing service-oriented architecture (SOA). We are trying to move services out of our monolithic application, and I want to share that story and what I learned along the way.
00:01:09.040
This talk is titled 'Service Oriented Architecture for Robust and Scalable Systems,' but I feel like this title is a bit misleading. While the content has mostly stayed the same, I want to relabel it to 'Distributed Architectures' instead. This is because people tend to have a certain picture in mind when they hear 'service-oriented architecture' versus 'microservices'.
00:01:16.080
I'd be open to discussions about this distinction, and a whole other talk could be dedicated to where the differences truly lie. From my point of view, it’s really about distributed architectures since both concepts apply.
00:01:27.440
Before joining Jimdo, I worked at an incubator in the middle of 2012. My job there was to bootstrap projects, and every three months, we would have a new project to initiate.
00:01:35.600
I was responsible for choosing the frameworks, programming languages, databases, and all that kind of stuff. Each new project was super exciting because I had a comprehensive vision of what I wanted to achieve. You have to focus on software architecture, right?
00:01:46.000
As a good software engineer, you have a picture in mind of how your software should look, the architecture, making plans, and selecting the right frameworks.
00:01:57.120
The first project we tackled was a mobile flea market application, where the idea was to create an app for people to sell their unused items. We had a solid plan to get started.
00:02:04.240
However, it was a young project, and back then, we used PHP. I learned my lessons during that project about writing quality software. You aim to write beautiful software following principles like hexagonal architecture and decoupling components.
00:02:19.120
But alongside this ideal, there’s significant business pressure to ship quickly, and that's a contentious topic. I had many discussions with my CTO about finding a middle ground.
00:02:34.079
The project ended up launching after six months instead of the planned three. That's just how software development works—so often, we plan three months but end up needing six.
00:02:41.439
After moving through various projects, the incubator ultimately went bankrupt, which is unusual because incubators typically have ample funds. It was a serious and complicated situation, and in the end, they had to let everyone go, including me.
00:02:57.840
However, as software engineers, we usually land on our feet. With the abundance of job opportunities, it wasn’t too difficult to find another position.
00:03:06.480
I ended up at Jimdo, which was quite the change. I came from a two-person incubator where I made decisions quickly and decisively, to Jimdo's established environment.
00:03:13.040
When I joined, Jimdo had been around for nearly eight years. When I started, there were about 150 employees and 50 engineers.
00:03:20.000
The experience showed me what a grown code base looks like—very tightly coupled with a tremendous amount of complexity.
00:03:28.400
There were a million lines of code in one repository, with different databases and patterns scattered everywhere. Some sections were true spaghetti code, while others were just over-engineered.
00:03:35.200
For example, if I wanted to store a picture in the general code base, it affected 16 classes. On the other hand, there were classes with about 6000 methods crammed into a single file, making it a mess.
00:03:44.640
I found it amusing because just a month earlier, I had been part of a very small team where all decisions were made very quickly, and now I was facing a big code base where decisions were far more complicated.
00:03:55.600
I often thought about how the mobile flea market app would evolve over seven years, and while it may still look the same, it would depend greatly on business constraints.
00:04:02.880
But I think we can all agree that overly complex architecture is detrimental since it slows down progress.
00:04:09.839
At Jimdo, we have felt that shipping features in this aged system takes an unusually long time. For instance, it took us two years to develop the API layer necessary for our iOS app.
00:04:19.840
All the business logic was there; we just needed a new controller layer, and yet it took us two years.
00:04:26.000
So, we face two extremes: on one side, we have easy startups with simple code and innovative solutions, while on the other, we have this complex, grown-up company architecture.
00:04:34.760
Many of you might feel like you're on one side or the other; rarely do we find ourselves in between.
00:04:41.639
It’s a gradual process to move from a simple base to a more complex code base, and when you find yourself in a big monolithic system, it feels like you're stuck.
00:04:50.080
That is why it is vital to know when to change your architecture, before it’s too late.
00:05:00.240
People often ask me when is the right time to change, and honestly, I can’t answer that definitively. It depends on your project and your timeline.
00:05:09.760
For Jimdo, as a bootstrapped company without external funding, we have to think carefully about how we spend our resources.
00:05:19.440
Investing in refactoring often does not yield quick returns; it’s a long-term investment that becomes harder when you're bootstrapped.
00:05:28.480
To avoid being trapped in a monolithic architecture with overgrown code, it’s essential to distribute your architecture.
00:05:39.600
Even if you have a large product, aim to have small, independent code bases that communicate with each other.
00:05:46.639
That’s the essence of distributed architecture, SOA, or microservices.
00:05:53.760
So, what is a distributed architecture in terms of SOA? Here’s a quote from Werner Vogels, the CTO of Amazon: "Service orientation means encapsulating data with the business logic that operates on that data, with access only through published service interfaces."
00:06:01.919
For those of you hoping to learn what SOA is, this quote sums it up perfectly. But we have 40 more minutes to fill, so let's delve deeper.
00:06:09.840
Jeff Bezos, known for his role at Amazon, once wrote to all tech employees in 2006.
00:06:16.480
He mandated that all teams expose their data and functionalities exclusively through service interfaces.
00:06:23.680
Overall, the only communication permitted is through service calls over the network, and teams must design their interfaces to be externalizable.
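To make the idea of "access only through a published service interface" concrete, here is a minimal sketch. It runs in a single process for brevity, whereas the memo requires calls over the network; the class and method names (`TemplateService`, `publish_template`, `get_template`) are hypothetical and not taken from Amazon or Jimdo.

```python
class TemplateService:
    """Owns template data; callers never touch the storage directly."""

    def __init__(self):
        # Private storage. In a real service this would be the owning
        # team's own database, reachable only through the service.
        self._templates = {}

    def publish_template(self, name, html):
        """Published operation: validate and store a template."""
        if not name:
            raise ValueError("template name must not be empty")
        self._templates[name] = html

    def get_template(self, name):
        """Published operation: return the template, or None if unknown."""
        return self._templates.get(name)


service = TemplateService()
service.publish_template("blog", "<h1>{{ title }}</h1>")
print(service.get_template("blog"))
```

Other teams would depend only on the two published operations, so the storage behind them can change without breaking callers.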
00:06:31.120
He closed the memo by stating that anyone who did not comply would be fired. That's a tough stance.
00:06:39.760
If I received such an email, I would probably quit. However, reflecting on it, back in the day, sticking with Amazon's philosophy was likely a wise choice.
00:06:48.800
Since 2006, Amazon has become the leader in cloud platforms and scalable systems. When we talk about distributed architecture, Amazon serves as a reference point.
00:06:58.639
This email marked a pivotal start for AWS, although the groundwork began in 2003.
00:07:07.360
The last quote I want to mention is Conway's Law, which states that organizations that design systems are constrained to produce designs that replicate the communication structures of those organizations.
00:07:15.040
In this context, how you communicate with your colleagues directly influences the architecture of your code.
00:07:24.000
This definitely applies to Jimdo's operation. Interestingly, we inverted Conway's Law when we molded the architecture toward how we wanted our code base to be.
00:07:31.440
Two years ago, the Jimdo development team decided to break into feature teams that would take charge of their own codebases.
00:07:39.760
Initially, we had one large development team managing a monolith, but then we split into eight feature teams. This led to the birth of our first services.
00:07:48.400
As teams, we want to maintain our independent codebases, which enhances simplicity.
00:07:55.440
There are various benefits to SOA. One of them is faster decision-making.
00:08:02.560
When you reduce the number of people involved in a codebase, you can make decisions more quickly. Have you ever tried to make a decision as a group of 50? It can take forever.
00:08:11.160
Even a team of three can struggle to agree on a new database or logging framework, let alone with dozens of differing opinions.
00:08:22.080
More importantly, splitting teams allows you to distribute responsibilities effectively.
00:08:28.960
For instance, I'm part of the template team, so I handle everything related to templates. I don’t have to deal with payment systems or APIs I don’t enjoy.
00:08:36.279
This specialization reduces complexity and facilitates easier onboarding, ultimately leading to faster development speeds and, consequently, happier customers.
00:08:43.520
Moreover, quicker deployments lead to more satisfied developers. Smaller codebases allow for faster continuous integration runs and deployments.
00:08:50.640
In many ways, smaller codebases equate to developer happiness and bring about the startup feel, which some of us thrive on.
00:09:00.000
Working in smaller teams leads to improvements in scalability as well. Each component of the system can be scaled independently.
00:09:07.480
So, if the payment service becomes a bottleneck, I can scale the payment part without affecting the template service.
00:09:14.040
As we look ahead, you may be asking how to implement this in your own environment.
00:09:20.480
One essential lesson is to build up your platform by ensuring your applications remain as stateless as possible.
00:09:29.760
The services you write should primarily avoid state. While some state is necessary, it should be minimal.
00:09:36.560
If your architecture has to maintain state, it easily complicates scaling, leading to substantial difficulties.
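One common way to keep a service stateless is to have the client carry a signed token, so that any instance can verify a request without looking up a stored session. The sketch below assumes this approach; the secret and the function names `sign` and `verify` are hypothetical, and this is not a description of how Jimdo handles sessions.

```python
import hashlib
import hmac

# Assumption: every instance of the service is configured with the same secret.
SECRET = b"hypothetical-shared-secret"


def sign(user_id):
    """Create a token that any instance can verify without stored sessions."""
    mac = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{mac}"


def verify(token):
    """Return the user id if the signature matches, otherwise None."""
    user_id, _, mac = token.rpartition(".")
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return user_id if hmac.compare_digest(mac, expected) else None


token = sign("alice")
print(verify(token))
```

Because no instance remembers anything between requests, instances can be added or removed freely when the load changes.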
00:09:43.280
A cautionary tale I want to share involves one of my coworkers, who was at a conference in 2008 and got excited about SQLite.
00:09:52.160
His idea was to give every customer their own database file. However, this resulted in significant challenges.
00:10:01.440
If something went wrong with a customer’s file, it would lead to access restrictions for their server and make recovery very difficult.
00:10:10.080
After five years of that implementation, we finally managed to eliminate it, but it took us half a year to remove it.
00:10:18.000
When dealing with state, be very careful about where it is managed to ensure manageable scaling.
00:10:27.200
Moreover, build your confidence. When planning your services in a distributed architecture, determine the best means of authentication.
00:10:35.000
Do you need a VPN for security? Typically, I believe VPNs create more issues than they solve due to accessibility constraints.
00:10:43.320
When using authentication, HTTP-based authentication should always utilize HTTPS for best security practice.
00:10:51.440
It's crucial to design your architecture with independence and isolation in mind. A service shouldn’t depend too much on shared components.
00:11:01.920
It’s acceptable for services to connect with one another, but try to minimize reliance on shared components.
00:11:10.200
Creating reliability is also paramount. Aim for automation, which facilitates fast recovery. Offloading the management of hard drives to someone else can also save precious time.
00:11:19.440
Focus on your core business goals by adopting single responsibility principles. Each service should address a specific concern.
00:11:28.399
You should avoid creating a single manager class to handle multiple responsibilities.
00:11:36.160
Instead, each responsibility should be handled by its designated service, such as a user service managing user-specific tasks.
00:11:44.720
A crucial design pattern to mention is the circuit breaker.
00:11:51.279
When one service struggles in an infrastructure, it’s critical to avoid overwhelming it with requests, as this prevents recovery.
00:11:59.200
Introducing a circuit breaker at the network boundary will help manage requests effectively.
00:12:07.200
Some implementations of this pattern include Netflix's Hystrix and a JavaScript version called Circuit Breaker.js.
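The core of the pattern fits in a few lines. The following is a minimal sketch, not the API of Hystrix or any other library: the class name, the thresholds, and the injectable `clock` parameter are assumptions chosen for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the struggling service."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast so the remote service gets room to recover.
                raise CircuitOpenError("circuit open, call skipped")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_user, user_id)` with a hypothetical `fetch_user`, and treat `CircuitOpenError` as "service unavailable right now".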
00:12:15.120
Another important pattern is applying back-pressure to a struggling service. Don't unnecessarily propagate errors through the system.
00:12:23.040
You should take action at the service level where the issue arises, like discarding non-critical requests during system overload.
00:12:30.720
For instance, if you gather metrics on system performance, it may be acceptable to discard those metrics rather than overwhelming a service.
00:12:39.440
However, if it's a critical payment request, you must ensure proper handling rather than dropping it.
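A rough sketch of that decision, assuming the service can see the depth of its own backlog: non-critical work is shed once the backlog passes a limit, while critical work is always admitted. The function name `admit`, the request kinds, and the limit of 100 are hypothetical.

```python
# Assumption: request kinds that must never be dropped.
CRITICAL_KINDS = {"payment"}


def admit(request_kind, queue_depth, max_depth=100):
    """Decide whether to accept a request given the current backlog."""
    if request_kind in CRITICAL_KINDS:
        return True  # critical work is never shed
    # Non-critical work (for example metrics) is dropped under overload.
    return queue_depth < max_depth


print(admit("metrics", queue_depth=150))
print(admit("payment", queue_depth=150))
```

The point of the design is that the overloaded service itself decides what to drop, instead of pushing the error further through the system.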
00:12:48.000
An essential element in a distributed environment is to set reasonable timeouts.
00:12:56.640
One good example comes from Travis CI, a continuous integration platform, where an excessive timeout caused significant delays in their queue.
00:13:05.120
When GitHub modified their API without warning, requests hung until a ten-minute timeout expired, causing the queue to overflow.
00:13:12.960
Always set reasonable timeouts, as they are crucial in not overwhelming your system with requests.
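In Python's standard library, for example, an HTTP call can be given an explicit timeout as in the sketch below. The function name `fetch` and the two-second default are assumptions; the right value depends on the service being called.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, timeout_seconds=2.0):
    """Fetch a URL, giving up after timeout_seconds instead of hanging."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.read()
    except (urllib.error.URLError, socket.timeout):
        # Connection problems and timeouts both end up here. The caller
        # can retry later, use a fallback, or drop the job.
        return None
```

Returning `None` keeps the sketch short; a real client would probably distinguish a timeout from other failures so that callers can react differently.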
00:13:20.160
You might have heard that people can only remember about three things from a presentation. Well, one point I want you to remember is to always default to timeouts.
00:13:29.360
Now, let’s discuss how you can integrate your services.
00:13:37.440
You can leverage high-level protocols such as Thrift, XML-RPC, or Protocol Buffers. Each has pros and cons depending on your specific use case.
00:13:44.240
If you’re developing internal services, Protobuf and Thrift may be beneficial, while external services often favor RESTful HTTP communication.
00:13:52.320
Message queues like RabbitMQ and Zookeeper are also reliable options for handling service interactions.
00:13:59.920
For example, one service can produce a job in the queue, while another retrieves the job from the queue.
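The producer and consumer roles can be sketched with Python's in-process `queue.Queue` standing in for a real broker. With RabbitMQ the two sides would be separate services talking to the broker over the network; the job format and the `None` sentinel here are assumptions made for illustration.

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for a broker such as RabbitMQ
results = []


def producer():
    """One service puts jobs on the queue."""
    for template_id in (1, 2, 3):
        jobs.put({"render_template": template_id})
    jobs.put(None)  # sentinel: no more jobs


def consumer():
    """Another service takes jobs off the queue and processes them."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.append(f"rendered template {job['render_template']}")


worker = threading.Thread(target=consumer)
worker.start()
producer()
worker.join()
print(results)
```

The producer never calls the consumer directly, so either side can be slow, restarted, or scaled without the other noticing, as long as the queue holds the jobs.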
00:14:07.568
I advocate for HTTP APIs as a straightforward interface for service communication. When implementing them, be mindful of headers such as Accept and Content-Type.
00:14:15.520
I've had personal experiences where APIs returned unexpected content types, which usually points to a problem in the implementation.
00:14:23.920
It’s crucial that your URIs point to resources and that the requested format is negotiated through headers rather than through file extensions.
00:14:31.360
Information that belongs in headers should not be put into the payload. Send it in the headers, where it belongs.
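A simplified sketch of header-based format selection follows. The same resource is rendered as JSON or plain text depending on the Accept header, and the chosen format is reported in Content-Type. The renderers and function names are hypothetical, and quality values (`q=...`) are deliberately ignored, so this is not full HTTP content negotiation.

```python
import json

# Hypothetical renderers for one resource, keyed by media type.
RENDERERS = {
    "application/json": lambda user: json.dumps(user, sort_keys=True),
    "text/plain": lambda user: f"{user['name']} <{user['email']}>",
}


def negotiate(accept_header):
    """Pick the first supported media type listed in the Accept header."""
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in RENDERERS:
            return media_type
        if media_type in ("*/*", ""):
            return "application/json"  # default representation
    return None


def get_user(user, accept_header):
    """Return (status, headers, body) for a request to a user resource."""
    media_type = negotiate(accept_header)
    if media_type is None:
        return 406, {}, ""  # Not Acceptable
    return 200, {"Content-Type": media_type}, RENDERERS[media_type](user)


user = {"name": "Ada", "email": "ada@example.org"}
print(get_user(user, "text/plain"))
```

The URI stays the same for every format; only the headers change, which is the behaviour the talk argues for.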
00:14:39.840
Lastly, when integrating services, it’s essential to ensure they form one cohesive product.
00:14:49.280
In conclusion, it's imperative to consider both technology and human aspects in distributed architecture.
00:14:55.920
At Jimdo, splitting the team structure meant we didn’t just split the components. It also encouraged increased communication and collaboration.
00:15:05.360
Ensure that as you decentralize architecture, you maintain the essence of team communication to prevent unintentional redundancy.
00:15:14.240
Even as we address the merits of SOA, we must always be aware of its downsides, such as logging and monitoring challenges.
00:15:22.400
In a distributed architecture, coordinating logging becomes complex as different teams may adopt various logging infrastructures.
00:15:31.040
When debugging, navigating through different monitoring services can be quite cumbersome.
00:15:40.800
Another issue arises when failures occur. Poorly handled cascading failures can lead to chaotic situations.
00:15:48.720
As each service struggles, orchestrating recovery becomes a challenge—determining which service to spin up first can be a complicated task.
00:15:57.920
In distributed systems, failure is inevitable. The question is not whether it will happen but when it will.
00:16:06.080
Finally, responsibility in a distributed architecture can be a double-edged sword. Though accountability is important, it can manifest in implicit ways.
00:16:15.120
Each team must recognize that they are responsible for their provided services, which reinforces the importance of their role.
00:16:20.960
Having a resilient system means that even when a service fails, the repercussions should ideally be minimal for users.
00:16:28.640
Netflix often emphasizes the importance of resilient systems through their design—users aren’t usually aware if a specific service isn't performing.
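One simple way to keep a failing service from being visible to users is to fall back to a default value. The sketch below assumes that generic data is acceptable when the real service is down; `with_fallback` and the recommendation example are hypothetical and do not describe Netflix's implementation.

```python
def with_fallback(primary, fallback_value):
    """Wrap a service call so that a failure degrades instead of propagating."""
    def guarded(*args, **kwargs):
        try:
            return primary(*args, **kwargs)
        except Exception:
            # Assumption: generic data is better than an error page here.
            return fallback_value
    return guarded


def flaky_recommendations(user_id):
    """Stand-in for a remote call to a service that is currently down."""
    raise ConnectionError("recommendation service is down")


recommend = with_fallback(flaky_recommendations, ["bestsellers"])
print(recommend(42))
```

This only suits non-critical features. A failed payment, for instance, should surface as an error instead of being silently replaced by a default.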
00:16:37.440
To wrap up the discussion, I'd like to share some real-world examples of distributed architecture in action.
00:16:45.440
Yammer shared a diagram at a recent conference showcasing many services they implemented. It demonstrates their approach, although it can be hard to follow all the interdependencies.
00:16:53.440
One interesting aspect was a service called Vario, which had no dependencies at all—a noteworthy ambition.
00:17:00.960
Another example comes from Twitter. They transitioned from a monolithic design, which they affectionately called 'Monorail,' to a microservice-oriented structure.
00:17:08.000
After realizing their existing structure couldn’t scale, they managed to start breaking it down into smaller components.
00:17:17.680
It is essential to note that completely eliminating the old monolith is a monumental task that takes dedication.
00:17:27.120
We are currently embracing similar approaches at Jimdo, where our initial dynamic templating service is being split into smaller components.
00:17:34.960
As we drill deeper into our structured services, continuous communication remains vital as teams manage responsibilities.
00:17:42.640
In summary, while SOA has inherent challenges, it also lays the groundwork for scalable architectures that can provide happier developers and end-users.
00:17:51.360
Ultimately, each part of a service can be independently managed and scaled, which significantly enhances the overall efficiency of the system.
00:18:00.080
Thank you all for your time. And I would love to take any questions you may have!
00:18:06.720
I hope you remember the three key elements from this session: timeouts, HTTP, and distributed architecture!
00:18:13.680
If you have any questions, feel free to approach me throughout the day. Thank you!