<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[pstree]]></title><description><![CDATA[Thoughts─└─{Ideas}└─{Stories}]]></description><link>https://pstree.cc/</link><image><url>https://pstree.cc/favicon.png</url><title>pstree</title><link>https://pstree.cc/</link></image><generator>Ghost 5.50</generator><lastBuildDate>Mon, 06 Apr 2026 11:00:19 GMT</lastBuildDate><atom:link href="https://pstree.cc/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Macros are great; ffs stop using macros.]]></title><description><![CDATA[<p>Macros are awesome. Macros are powerful. Macros will single-handedly turn your simple, readable codebase into an incomprehensible Lovecraftian nightmare of arcane symbols and unpredictable behavior. And yet, every day, developers continue to reach for them like moths to a flame.</p><p>So, what&#x2019;s the deal? Why are macros</p>]]></description><link>https://pstree.cc/macros-are-great-stop-using-macros/</link><guid isPermaLink="false">677e728396636b1ca22f846e</guid><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Fri, 21 Feb 2025 12:29:55 GMT</pubDate><media:content url="https://pstree.cc/content/images/2025/02/sQlq3.png" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2025/02/sQlq3.png" alt="Macros are great; ffs stop using macros."><p>Macros are awesome. Macros are powerful. Macros will single-handedly turn your simple, readable codebase into an incomprehensible Lovecraftian nightmare of arcane symbols and unpredictable behavior. And yet, every day, developers continue to reach for them like moths to a flame.</p><p>So, what&#x2019;s the deal? Why are macros both a godsend and a cursed artifact? 
And why, for the love of all that is good and holy, should you stop using them in most cases?</p><p>You know the drill. Take a moment before scrolling down.</p><h2 id="wtf-are-macros">Wtf are macros?</h2><p>At their core, macros are a way to tell the compiler (or preprocessor, depending on the language) to replace one piece of code with another&#x2014;usually at compile time. This can mean anything from simple text replacement (<code>#define</code> in C) to full-on metaprogramming sorcery (Rust&#x2019;s procedural macros, Lisp&#x2019;s <strong><em>everything</em></strong>).</p><p>Macros can be used to:</p><ul><li>Generate repetitive boilerplate</li><li>Optimize performance by inlining code</li><li>Perform compile-time computations</li><li><em><strong>Completely wreck your debugging experience and make outages a living hell.</strong></em></li></ul><h2 id="macros-are-great-sort-of">Macros are great! (Sort of.)</h2><p>Macros exist because they solve real problems. Let&#x2019;s take an example in Rust:</p><pre><code class="language-Rust">macro_rules! make_struct {
    ($name:ident) =&gt; {
        struct $name {
            data: i32,
        }
    };
}

make_struct!(Foo);
make_struct!(Bar);</code></pre><p>Boom! We just created two structs without writing the same code twice. Feels great, right?</p><p>Or in C:</p><p><code>#define SQUARE(x) (x * x)</code></p><p>Nice! We don&#x2019;t have to write a function for squaring a number. More efficient, right?</p><p>But then, some poor soul writes this:</p><p><code>int result = SQUARE(5 + 2);</code></p><p>And the preprocessor happily expands it to:</p><p><code>int result = (5 + 2 * 5 + 2);</code></p><p>Which evaluates to <code>17</code>, not <code>49</code>. The textbook fix is to parenthesize everything, as in <code>#define SQUARE(x) ((x) * (x))</code>, but even that breaks on <code>SQUARE(i++)</code>, which evaluates <code>i++</code> twice. Congratulations, you have just entered <em>Macro Hell&#x2122;.</em></p><h2 id="the-dark-side-of-macros">The dark side of macros</h2><p>While macros give you power, they come with some very nasty trade-offs:</p><h3 id="1-debugging-macros-is-pain-incarnate">1. <strong>Debugging macros is pain incarnate</strong></h3><p>Ever tried stepping through a macro in a debugger? It&#x2019;s like trying to read a book that&#x2019;s being rewritten while you turn the pages. Since macros operate before actual compilation, they don&#x2019;t exist in your final code in a way that debuggers can follow.</p><h3 id="2-macros-break-syntax-highlighting-and-tooling">2. <strong>Macros break syntax highlighting and tooling</strong></h3><p>A good IDE can handle functions, classes, and modules. But when you start using macros, all bets are off. Code completion stops working, error messages become cryptic, and syntax highlighting goes on strike.</p><h3 id="3-macros-introduce-hidden-complexity">3. <strong>Macros introduce hidden complexity</strong></h3><p>What looks like a single macro call could expand into a monstrous web of conditional branches and recursive templates. Your future self (or your teammates) will curse you when they have to decipher the spaghetti code generated by an innocent-looking macro.</p><h3 id="4-better-alternatives-exist">4. 
<strong>Better alternatives exist</strong></h3><p>Most modern languages offer built-in solutions that achieve the same goals <em>without</em> the pain of macros:</p><ul><li><strong>Inline functions</strong> (C++, Rust, etc.)</li><li><strong>Generics and templates</strong> (C++, Rust)</li><li><strong>Metaprogramming tools</strong> (Python decorators, Rust&#x2019;s <code>derive</code>, etc.)</li><li><strong>Prebuilt utilities</strong> (Why write a macro when a standard library function exists?)</li></ul><h2 id="when-should-you-actually-use-macros">When should you actually use macros?</h2><p>Let&#x2019;s be fair&#x2014;there <em>are</em> cases where macros are the right tool for the job:</p><ul><li><strong>Domain-Specific Languages (DSLs):</strong> Rust&#x2019;s <code>#[tokio::main]</code> macro, Lisp&#x2019;s everything.</li><li><strong>Compile-time computations:</strong> If your language lacks constexpr-style evaluation.</li><li><strong>When no other alternative exists:</strong> Some low-level performance-critical cases.</li></ul><p>But for everyday coding? Macros are overkill. Just use functions, generics, or templates instead.</p><h2 id="conclusion">Conclusion</h2><p>Macros are powerful, but with great power comes great unreadability. Use them <em>only</em> when necessary, and for everything else, let the compiler do the work with more maintainable alternatives.</p><p>If you must use macros, at least document them properly, or risk summoning eldritch horrors into your codebase.</p><h2 id="further-readings">Further reading</h2><ul><li>Hygienic Macros in Rust</li><li>C Preprocessor Nightmares</li><li>How Lisp Macros Work</li><li>Template Metaprogramming in C++ (A Cautionary Tale)</li></ul>]]></content:encoded></item><item><title><![CDATA[What people don't get about decentralization]]></title><description><![CDATA[<p>This post is a thought dump from the 40 hours of no-internet flight time I had over the last week. 
It was rough.</p><h2 id="intro">Intro</h2><p>This is not meant to be a linguistics lecture but more of an attempt to set a mental model for what the word decentralization in the</p>]]></description><link>https://pstree.cc/what-people-dont-get-about-decentralization/</link><guid isPermaLink="false">62a0d14ef8493f1229f7c3f0</guid><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Mon, 27 Jun 2022 14:19:02 GMT</pubDate><media:content url="https://pstree.cc/content/images/2022/06/0__y6NFTTFXekb9EP6-1.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2022/06/0__y6NFTTFXekb9EP6-1.jpeg" alt="What people don&apos;t get about decentralization"><p>This post is a thought dump from the 40 hours of no-internet flight time I had over the last week. It was rough.</p><h2 id="intro">Intro</h2><p>This is not meant to be a linguistics lecture but more of an attempt to set a mental model for what the word decentralization in the blockchain space can mean. You probably ran into angry tweeps claiming true decentralization is not possible and crypto bros saying governments will be managed by DAOs in the future. But are they talking about the same thing? Can we define what decentralization is? Is it even one thing?<br>The goal of this post is to go over some of the decentralization ethos, how they can manifest in the crypto space, and what all of that means for you.</p><h2 id="new-word-polysemy">New word: <strong>Polysemy</strong></h2><p>I was trying to find the English expression for &quot;A word that has the capacity to mean many things&quot; and it seems Polysemy is just that. 
In an engineering capacity, I&apos;m usually very reluctant to use the words decentralized/decentralization/decentralize liberally without clarifying &quot;Decentralization of what?&quot;<br>So I guess this is a good space to go over all the ways Crypto is decentralized (and sometimes not at all).</p><p>The core definition of decentralization is the <em><strong>dispersion of power.</strong></em> It&apos;s very important to let this settle in. My view of decentralization is that it breaks down powerful singletons, but it does not promise complete freedom. It follows economic consensus in decision-making. Blockchains build on this concept by implementing different economic guarantees to incentivize productive behaviors and disincentivize malicious ones.</p><h2 id="decentralization-of-opportunity">Decentralization of Opportunity</h2><p>This is the most obscure term but the one I like the most. <strong><em>Decentralization of opportunity means that opportunities in markets (DeFi, creative, insurance, etc.) should be fairly available across the globe.</em></strong></p><p>One of the most debated questions about crypto is &quot;why the hype over a seemingly expensive way to do exactly the same things we used to do before?&quot; At the end of the day, you can browse the internet, watch videos, trade stocks, and put your money in a bank without blockchains, right?</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2022/06/5cjf4n--1-.png" class="kg-image" alt="What people don&apos;t get about decentralization" loading="lazy" width="864" height="431" srcset="https://pstree.cc/content/images/size/w600/2022/06/5cjf4n--1-.png 600w, https://pstree.cc/content/images/2022/06/5cjf4n--1-.png 864w" sizes="(min-width: 720px) 720px"><figcaption>Right?</figcaption></figure><p>For me, decentralization of opportunity is one of the biggest factors that sold me on crypto and blockchains. 
If there is a better alternative that offers the same decentralization of opportunity crypto does, I&apos;m a buyer.<br>There are many, many examples of how the current financial/technology sectors are <em>NOT fair</em> and are <em>walled off</em> from the public. It becomes much clearer when you study how VCs behave when their portfolios are in conflict, the &quot;buy the market first&quot; move by company S and its backers, or the nasty deplatforming techniques of company A. It&apos;s also very apparent to anyone who spends 10 minutes studying the current financial system, especially the trading arm. I want to dig deeper into both themes, starting with the latter, and break down how decentralization of opportunity has huge potential.</p><h3 id="why-its-not-fair-for-you-to-trade-on-the-stock-market-today">Why isn&apos;t it fair for you to trade on the stock market today?</h3><p>To place a trade as a retail investor, let&apos;s say for AMD stock, you are placing yourself at the mercy of 3-4 corporations. Let&apos;s dig into how this happens:</p><p>1- Alice places a trade BUY 100 AMD on broker X</p><p>2- <strong><em>Broker X</em></strong> now does two things: it takes its slippage/spread % and trading fee (% or const) from Alice&apos;s account and matches Alice with an opposing trade on its order book (Think of it as a SELLER selling 100 AMD, but it&apos;s more complicated than that)</p><p>3- Moreover, that SELLER is a market maker 75-80% of the time (US market datapoint) and is the SAME market maker 35% of the time. In a traditional financial system, a market maker is a corporation that pools money from wealthy individuals and trades with very large volumes, with the goal of making money by buying from sellers at one price and selling again at a higher price. 
They benefit from much better prices, they can front-run retail (most market makers are literally hardwired to exchanges so they can submit trades quicker than anyone), but they pinky promise they won&apos;t do any of that. I guess you&apos;ll have to trust them.</p><p>4- Taking it one notch up, your broker can (<a href="https://www.washingtonpost.com/business/2021/01/29/robinhood-citadel-gamestop-reddit/?ref=pstree.cc">and boy they do</a>) share your trading data with a market maker before they execute your trade, giving the market maker a huge advantage on the price they set for your order (and other people&apos;s)<br><br>5- On the blurry line between what&apos;s legal and what&apos;s right, your broker can totally <a href="https://www.forbes.com/advisor/investing/robinhood-gamestop-trading/?ref=pstree.cc">manipulate trading and halt your trades</a> if it doesn&apos;t work out for the giants footing their bill.</p><p>The biggest thing to notice about this paradigm is not the flagrant unfairness of the model or the strong misalignment of incentives. For the sake of our attempt to define decentralization of opportunity, the most important part of our previous model is permissions. To become a market maker you need permission from every exchange; to participate in market making you need to meet very large wealth requirements. Market makers are very picky about whom they manage money for. Market making is mostly restricted to regulated markets like equities and commodities, and there are very few market-making companies for private equity, art, and other niche investments.<br><br>To trade you&apos;ll need permission from the broker, and you are bound by that permission, which <a href="https://www.ufgwm.com/english-publications/interactive-brokers-banned-russians?ref=pstree.cc">a broker can take away anytime they feel insecure</a>.<br><br><strong>What does crypto offer (or aspire to*)?</strong></p><p>Fewer hops between you and services, both as a consumer and as a service provider. 
A decentralized application allows you to use it as long as you have a wallet on the blockchain. You like it a bit too much? Run a validator node, contribute to governance, stake value onto the chain.<br><br>Projecting this to DeFi, you place trades through an automated market maker where ANYONE can participate and benefit from the same slippage and APY without third parties taking the largest cut. Even better, BE the market maker: be the liquidity provider to traders and earn the return traditional market makers make, with your $100 bill, without any approvals, minimum requirements, etc.</p><p><strong>Pitfall #1:</strong><br>Most crypto trading at the moment is happening on custodial exchanges. That&apos;s not at all decentralized and rarely benefits from any of the previous stuff. Outfits like Celsius (the definitely-not-a-bank) and CEXs where you don&apos;t own your money are NOT DeFi. They have the same parasitic effect on Crypto as early Tech companies had on computing and then the internet, trying to wrap a social endeavour into a money-making machine. Celsius&apos;s investment strategy is literally putting your money in DeFi protocols and taking a cut. They are the not-so-creative middlemen of crypto. The same goes for most CEXs. </p><blockquote class="kg-blockquote-alt">Not your keys, not your money.</blockquote><h2 id="decentralization-of-initiative">Decentralization of Initiative</h2><p>Decentralization of initiative is not something new, not something crypto brings to the table. It&apos;s probably been around for ages. I have the urge to define it and point it out because this is usually what people overlook when they are looking at the crypto space holistically. Blockchain efforts are not linear; people from all over the place with all different backgrounds are trying and experimenting with different ideas ( &#x2200; Idea I(X, Y) s.t X &#x2208; [Brilliant, Dumb], Y &#x2208; [Succeeds, Fails] )</p><p>Outsiders usually view crypto as one pot of action. 
Even the most educated ones who try harder will probably have a split view of crypto and nothing more: memecoins and serious initiatives. But the thing is, crypto is way too large to abstract away as any number of teams. It has all the colors and all the types of mindsets. Crypto has builders who succeed, builders who fail, scammers who succeed, scammers who fail, etc.</p><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2022/06/1411410a4e1717b8c0137ecc2f544a1d.jpg" class="kg-image" alt="What people don&apos;t get about decentralization" loading="lazy" width="640" height="317" srcset="https://pstree.cc/content/images/size/w600/2022/06/1411410a4e1717b8c0137ecc2f544a1d.jpg 600w, https://pstree.cc/content/images/2022/06/1411410a4e1717b8c0137ecc2f544a1d.jpg 640w"></figure><p>I love that about crypto. Crypto ethos are being implemented in a decentralized manner. Builders are opinionated and critical of other builders. They are also very smart (<a href="https://www.reddit.com/r/ethereum/comments/tkwwr1/by_my_hand_dai_will_die_new_bizarrely_misguided/?ref=pstree.cc">Most of the time</a>). Companies building in the crypto space have VERY smart engineers and scientists working on extremely hard problems. Turing award winners are doing research on crypto, along with many renowned distributed-systems engineers and highly cited scientists. The researchers behind Paxos, ZKP, BLS, etc. are actively working on ensuring guarantees that Twitter critics dismiss with a &quot;hot take&quot; in a thread. Do your own research.<br><br>The core teams (the actual builders, both those who succeed and those who fail) are the most critical of each other. They are VERY opinionated. The founders of every L1 don&apos;t agree on most things. Most of them are genuinely trying to build something that fulfills the crypto ethos.</p><p>Most of the valid criticism coming from crypto critics is not new material. 
They&apos;re well-known problems that crypto folks are actively working on solving. Things like MEV, discovery, state pruning, etc. are well known and actively being researched.<br><br><strong>Pitfall #2:</strong><br>Any niche new tech should be actively criticized. Especially when it&apos;s being built for public use. Crypto is taking its first steps into mainstream use. The first email ever was sent in 1971. The first commercially available email came 30 years later. You probably first used email 40 years later. <strong><em>Big things take time</em></strong>. The internet was first proposed in the 1960s; after many, many iterations, TCP was created in the 1970s, adopted by DARPA for military use 10 years later, made public 25 years afterwards, and only became mainstream 35 years after that. The idea that the internet took 10 years to become what it is, or that it&apos;s been so long for crypto, is <strong>so naive.</strong> The only difference between now and then is that before, things were invented behind a curtain, broke behind a curtain, and were iterated on behind the curtain. Crypto failed repeatedly, in public. You can open the tx that drained UST and read the code for the exploit that hit Polynetwork (<em>heck, you can try the exploit yourself</em>). Early adopters always risk high volatility/bugs/vulnerabilities; this is true for any kind of investment. If you expect 10x but not 0.1x, life ought to teach you otherwise.</p><h2 id="decentralization-of-governance">Decentralization of Governance</h2><p>One of the core misconceptions about crypto is that it&apos;s a software-first initiative. While the software is complex enough for software engineers to shed tears when debugging distributed-systems bugs in open-source p2p software, <strong>the most important part of crypto is the social layer</strong>. Most dApps are open-sourced contracts ANYONE can read, fork, audit, improve, etc. 
The most important reason ETH or BTC ever had a price (beside on-chain value generation) is that at least two people agreed to exchange these coins at that price.</p><p><br><strong>Decentralization of Governance is the dispersion of centralized authority into a broader social consensus.</strong></p><p>It means no single person/entity should be allowed too much authority over a network. This can be defined in so many ways, but there is one way that matters to me personally. People should be able to be anything, do anything on the network <strong>as long as <em>the economic consensus on the chain allows it</em></strong>. The idea that web3 is built for criminals and their ilk is <a href="https://www.chainalysis.com/?ref=pstree.cc">painfully detached from reality</a>. A web with no governance would suck for everyone. Tor is a very practical life experiment. It has MANY great utilities. Having a censorship-resistant web makes it very hard for governments to censor journalism the way many countries have on the current web. Decentralization of authority makes it so that no single politician with too much power can BAN something on the internet just because they don&apos;t like it. Blockchains aspire* to adhere only to economic consensus. The problem with Tor is that it came with a price in the form of many dark corners. My view of Web 3.0 aspires to deliver censorship-resistant content that answers only to the economic consensus mechanisms in place. If the chain&apos;s economic consensus wants WikiLeaks, WikiLeaks stays. 
People with stake control the content.<br><br><strong>Pitfall #3:</strong><br>Borrowing from the previous section, &quot;Decentralization of Initiative&quot;, Decentralization of Governance can&apos;t be maintained if you entrust middlemen like <a href="https://www.cbc.ca/news/canada/ottawa/mareva-injunction-order-extended-freedom-convoy-crypto-financial-donations-frozen-1.6366975?ref=pstree.cc#:~:text=Rare%20legal%20move%20freezes%20cash%2C%20crypto%20tied%20to%20Ottawa%20protest&amp;text=An%20Ontario%20court%20has%20frozen,in%20Canada%20to%20target%20cryptocurrency.">centralized exchanges or definitely-not-banks like Celsius</a>. Also, due to being the youngest of the crypto family, <a href="https://www.cnbc.com/2022/06/20/users-of-defi-app-solend-block-attempt-to-take-over-whale-account.html?ref=pstree.cc">it still needs some work</a>.</p><p><strong>Pitfall #4:</strong><br>Governance v. Delegation: In the blockchain space, most of the software is obscure to non-tech people. As a non-technical person, you feel you are trusting a specific developer to deliver on their code. And that might be true, but it&apos;s very different from today&apos;s model. Successful protocols have their code open-source; they are audited by the public and checked by every techie with a stake in the matter. You are not trusting the specific developer building the protocol, you are trusting the social consensus that this is a good/bad protocol.</p><h2 id="a-glimpse-into-the-future-of-web-30">A glimpse into the future of Web 3.0</h2><p>This is purely a personal interpretation of many, many conversations I&apos;ve had about where folks see Web 3.0 going. It&apos;s prone to change, to collapse, or to evolve. 
It&apos;s speculation, for that matter.</p><p>Your music is supplied through a decentralized app; tracks are held by their owners as NFTs/whatever-standard-emerges, and you pay royalties to stream. You pay less for your subscription, and your money goes directly to the musicians publishing their music as royalties instead of <a href="https://www.musicbusinessworldwide.com/spotify-says-it-cant-pay-songwriters-better-royalty-rates-its-also-spending-320-million-on-a-barcelona-sponsorship-deal/?ref=pstree.cc#:~:text=In%20both%20cases%20(2018%2D2022,i.e.%20a%2030%25%20cut).">~90% going to the platform.</a> A % goes to stakers; those are people, some can be corporations, but mostly, people.<br><br>Your ride-hailing app is decentralized: the largest portion goes to the driver, and instead of a <a href="https://www.uber.com/gh/en/drive/basics/tracking-your-earnings/?ref=pstree.cc#:~:text=Uber%20charges%20partners%2025%25%20fee,The%20use%20of%20Uber%20software">25% cut going to the platform</a>, a much smaller % goes to node operators. You like a certain ride-hailing app? Be a node operator, be a staker. No permissions required.</p><p>The stock market is tokenized and trades 24 hours a day. Market makers are automated. <a href="https://blog.chain.link/chainlink-fair-sequencing-services-enabling-a-provably-fair-defi-ecosystem//?ref=pstree.cc">No front running</a>, <a href="https://data.chain.link/?ref=pstree.cc">no opaque pricing</a>. Anyone can be a market maker. Microloans are instant. <a href="https://www.deco.works/?ref=pstree.cc">Mortgages are equitable and privacy-maintaining</a>. <a href="https://chain.link/use-cases/insurance?ref=pstree.cc">Insurance is parametric and provably fair</a>. <br><br>You own your data on the blockchain with a decentralized identity. No need to ask platforms to delete your data, because they don&apos;t have your data. You onboard your data to the social network when you log in. You allow them to write their own &quot;Directory&quot; to your data. 
You move your data around platforms at will, and you can delete any piece of data... forever.</p><p><a href="https://basicattentiontoken.org/?ref=pstree.cc">You get paid for watching ads</a>; you selectively choose when to share your data with ad networks (and when not to). That money is split between you and the platform running the ad placement so they can fund your app usage. You get to choose to fund your usage with money instead of watching ads.</p><p>You use your money, both tokens and fiat, liberally. You can send money overseas for a <a href="https://l2fees.info/?ref=pstree.cc">fraction of the cost in a fraction of the time</a>. The idea that crypto is slower than current platforms baffles me. It&apos;s an insanely misinformed idea. And that&apos;s even without considering things like L2 and ETH 2.0.</p><h2 id="thats-it-for-today">That&apos;s it for today.</h2>]]></content:encoded></item><item><title><![CDATA[Wtf is Gossip Protocols?]]></title><description><![CDATA[Gossip protocol is a standard protocol for leaderless distributed systems.]]></description><link>https://pstree.cc/wtf-is-gossip/</link><guid isPermaLink="false">5f1c3fa0c90e0f2d118514d2</guid><category><![CDATA[wtf-series]]></category><category><![CDATA[tech]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Fri, 18 Mar 2022 16:31:00 GMT</pubDate><media:content url="https://pstree.cc/content/images/2020/07/gossio.gif" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2020/07/gossio.gif" alt="Wtf is Gossip Protocols?"><p>This is part of a series of posts explaining cryptic tech terms in an introductory way.</p><p><strong>Disclaimer:</strong> this series is not intended to be a main learning source. 
However, there might be follow-up posts with hands-on experiments or deeper technical content for some of these topics.</p><h2 id="motivation">Motivation</h2><p>This post extends a problem from our <a href="https://pstree.cc/wtf-is-rendezvous-hashing/">previous post</a> about <em>rendezvous hashing</em>. Assume you have a system consisting of a cluster of <strong><em>serving nodes </em></strong>that simply take a request from a client and route it to the correct <strong><em>back-end node</em></strong> after altering some aspects of that request. </p><p><em>Using rendezvous hashing gives us a node we should talk to (the top of the list Cn), but it also gives us an ordering for fail-over if that node fails. This is very interesting because it didn&apos;t only solve the load-balancing fail-over issue; it also established what&apos;s called distributed k-agreement. Imagine a write request coming to the nodes. This write needs to be replicated across nodes in some way so that if the node that received the write request fails, the client can fail over to the next. In our scenario, if the client and serving nodes share the same H(Sn, X), nodes can decide the order of replication and create a replication chain that follows the same order as the client&apos;s fallback. Given a discovery service that basically just keeps track of all the nodes around, all members can distributively adapt to changes in the node structure without having to communicate with the other nodes.</em></p><p>So my previous post covered what the <strong>client</strong> should do if a <strong>serving node</strong> failed and what the <strong>serving nodes</strong> should do if one of them failed. 
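</p><p>To make that recap concrete, here&apos;s a tiny from-scratch sketch of rendezvous ordering (my own illustration, not the previous post&apos;s exact code): every participant hashes each node together with the request key and sorts by the score, so clients and nodes derive the same fail-over list without talking to each other:</p><pre><code class="language-python">import hashlib

def rendezvous_order(nodes, key):
    """Order nodes by H(node, key); everyone tries them top-down on failure."""
    def score(node):
        return hashlib.sha256(f"{node}:{key}".encode()).hexdigest()
    return sorted(nodes, key=score, reverse=True)

# Any client computing this for the same key gets the same ordering.
print(rendezvous_order(["node-a", "node-b", "node-c"], "request-42"))
</code></pre><p>Because the ordering is a pure function of (node, key), the replication chain and the client fallback agree by construction.</p><p>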
Yet, without a proper way to know a serving node failed as soon as it fails, we can easily imagine the cluster shrinking down to a single node taking all the client traffic.</p><blockquote>Think of it this way: All clients will have some ordering of N nodes, N-1 failed and a single working node.</blockquote><p>It seems we need a way for nodes to learn about other nodes&apos; state and events and act accordingly. But how can we do that?</p><blockquote>You know the drill. Take a moment before scrolling down.</blockquote><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2020/06/story2.jpg" class="kg-image" alt="Wtf is Gossip Protocols?" loading="lazy"><figcaption>Multi-node system</figcaption></figure><h3 id="use-all-the-health-checks-">Use all the health checks!</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2020/06/image.png" class="kg-image" alt="Wtf is Gossip Protocols?" loading="lazy"><figcaption>Me overly excited about health checks. Sue me.</figcaption></figure><p>Well, yes and no. Health checking is a very nice feature that allows some node ( <a href="https://buffer.com/resources/inclusive-language-tech/?ref=pstree.cc">supervisor node?</a>) to probe other nodes over a network link to make sure those nodes are /healthy/, whatever that means. It can be as simple as pinging a port and seeing if it responds, or calling an RPC endpoint that runs health diagnostics and returns the node&apos;s health. Sometimes, in critical systems, both types of health checks are mixed: you can have a more frequent but lightweight health check and a less frequent but more involved diagnostics check. It&apos;s usually used in load balancers to decide if a node is healthy enough to send traffic to. What&apos;s different in our case is that we don&apos;t have a supervisor node. 
All nodes are normal serving nodes, but they are also collectively responsible for the overall availability of the system. Additionally, like any distributed system, nodes can lose communication through network partitions (think of network partitions as your phone temporarily disconnecting when passing through a tunnel; you know once you&apos;re out of the tunnel things will get better, but for the time being, you are cut off)</p><p>So maybe health checks are not the best way to go. What else can we do here?</p><blockquote>You know the drill. Take a moment before scrolling down.</blockquote><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2022/03/image-5.png" class="kg-image" alt="Wtf is Gossip Protocols?" loading="lazy" width="940" height="450" srcset="https://pstree.cc/content/images/size/w600/2022/03/image-5.png 600w, https://pstree.cc/content/images/2022/03/image-5.png 940w" sizes="(min-width: 720px) 720px"></figure><h3 id="use-all-the-heart-beats-">Use all the heartbeats!</h3><p>Let&apos;s try to analyse this idea. All the nodes know about all the other nodes. Every N units of time, each node calls all the nodes it knows about and says &quot;hey, I&apos;m fine&quot;, to which the nodes respond &quot;Oh, good to know you are fine. I&apos;ll make sure to remember that for the next N units of time&quot;</p><p><br><strong>How can this friendly banter ever break systems?</strong></p><blockquote>You know the drill. Take a moment before scrolling down.</blockquote><p>This can be /okay/ on 3-4 nodes. But consider the amount of CPU-bound work nodes need to do <strong><em>just</em></strong> for health checks; even worse, consider the bandwidth these health checks impose on the network. For N nodes, there will be N&#xB2; messages floating around <em><strong>just </strong></em>for health checks. 
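</p><p>To put rough numbers on that quadratic blow-up (a toy calculation of my own, not a benchmark), here&apos;s a quick Python sketch counting per-round chatter when every node pings every other node:</p><pre><code class="language-python">def heartbeat_messages(n_nodes):
    """Messages per heartbeat round when every node pings every other node."""
    return n_nodes * (n_nodes - 1)

# Doubling the cluster roughly quadruples the chatter.
for n in (4, 10, 100, 1000):
    print(n, heartbeat_messages(n))
</code></pre><p>At 1,000 nodes that&apos;s 999,000 messages per round, before a single byte of useful traffic is served.</p><p>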
Furthermore, because the CPUs and network links between these nodes are under different loads, this will cause time deviation in the tickers between the nodes, and what you&apos;ll end up seeing is a constant surge of health checks coming in at all times. There are more technically involved discussions on why many engineers believe that heart beats are rarely a good choice for health checks, but that&apos;s a topic for another time.</p><h2 id="so-what-do-we-do-then">So, what do we do then?</h2><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2022/03/gossio.gif" class="kg-image" alt="Wtf is Gossip Protocols?" loading="lazy" width="640" height="413"></figure><p>Gossip protocol is a standard protocol for leaderless distributed systems. Many distributed systems rely heavily on Gossip protocol for failure detection and message broadcasting. It follows a very simple model. Imagine you are in your cubicle in an office and you just heard some bad news about a colleague K who got fired (unhealthy node). You, being an active person in the office gossip culture, want to spread that news but immediately realize it&apos;s very inefficient going to every single person telling them what you know (heart beats), and it&apos;s also inefficient to wait for people to come to your desk (health checks), so you decide to let the folks around you know about the news. You rely on the social contract that those nodes are equally active in the office gossip culture and will spread this message across the office like an infection. Now everyone knows that K got fired, and they won&apos;t share data with them (replication) and won&apos;t ask them to deliver on their tasks (serving traffic). Well, this is a fun analogy, but it doesn&apos;t cover all of the core concepts of gossip, so let&apos;s dive in:</p><h3 id="continuous-communication">Continuous communication</h3><p>Gossip relies on the idea that communication is continuous and periodic. 
You don&apos;t just talk to Lisa when someone is getting fired; you talk periodically about everything that&apos;s happening in the office so that Lisa knows you are okay. If Lisa doesn&apos;t hear from you for a while and <em>no one else is mentioning they saw you in the office,</em> Lisa will assume you are on leave or might&apos;ve gotten fired yourself.</p><p>Each node maintains a state of propagated news. The state can include a list of nodes I heard from in the last N time units (heard from means either they talked to me directly, or someone who talked to me has them in their tracked list) as well as some diagnostic data that we won&apos;t cover here.</p><h3 id="neighboring-v-random">Neighboring v. Random</h3><p>One of the decisions engineers make when implementing gossip is how to pick the nodes for the next step of gossip. This can be either a random peer selection or a more deterministic neighboring peer selection. Imagine you have some gossip you want to share; you have two options. The first is to share with neighboring cubicles: those are folks you&apos;ve known for quite some time and probably have a closer bond with. You can rely on their availability more than others (less probability of a network partition), and it makes the gossip spread more predictably (if all nodes picked the two most neighboring nodes, you could simulate the gossip graph deterministically).</p><p>The other option is to select nodes randomly from all the nodes in the cluster. One of the arguments for that approach is that it increases the chance of a faster spread across different network links. Imagine you have multiple availability zones: if you go with neighboring peer selection, gossip will only hop to availability zone B after everyone in availability zone A has learned the news. That&apos;s not optimal for disaster recovery. 
Even worse, imagine the delay between the gossip originating in availability zone A and being picked up by nodes in availability zone C.</p><p>I&apos;ve seen implementations of a hybrid approach where a middleman determines, based on the news content, whether this piece of news should be propagated randomly or to neighboring nodes. Of course, this adds complexity to an already complex protocol, as well as round-trip latency. One approach is to only do the round trip when in doubt, i.e., if all is good, use neighboring selection; if there is a disaster, call the service to do a more informed peer selection.</p><h3 id="networking-protocol">Networking Protocol</h3><p>While this protocol is usually implemented in higher layers of the communication stack, its drop-resilient communication and low message overhead mean it is sometimes implemented on top of more efficient transports, e.g. multicast, for a much lower overhead on the network and, of course, fewer guarantees on delivery.</p><p>Because every node only calls a subset of k nodes with gossip (usually 2), the gossip will propagate through the cluster in log&#x2082;N time.</p><h2 id="disadvantages-of-using-gossip-protocols">Disadvantages of using Gossip Protocols</h2><ul><li>Standard implementations of Gossip protocols rely heavily on the idea that nodes are not malicious. If you opt in for a standard implementation, this means one of the cluster&apos;s nodes can abuse the system if it is breached by hackers. Security-aware Gossip protocol implementations trade off simplicity and overall latency for a more secure approach.</li><li>Standard implementations of Gossip protocols have weak ordering guarantees, which means that, due to latency and uneven load distribution among other things, your system can be prone to having two <em>(or more)</em> views of the world simultaneously. This is known as the &quot;split brain&quot; problem. 
As long as the implementation guarantees eventual consistency and the system is designed to tolerate it, this should be fine, but it adds multiple considerations when extending the system.</li></ul><h2 id="who-uses-gossip">Who uses Gossip?</h2><p>Gossip is used by a large number of distributed systems; Cassandra, Consul, CockroachDB and DynamoDB are among the most well known. Also, most blockchain-based projects rely on gossip protocols to broadcast new <em><strong>gossip</strong></em> to participating nodes in the peer network.</p><h2 id="further-readings">Further readings</h2><ul><li>SWIM</li><li>Gossip seed nodes</li><li>Gossip Topology-aware peer selection</li><li>Modeling gossip fan out configuration</li><li>Pull-based Gossip</li><li>HashiCorp memberlist</li></ul>]]></content:encoded></item><item><title><![CDATA[Simple problems at scale: log tailing]]></title><description><![CDATA[<p><strong>Simple problems at scale</strong> is a series of mini-blogs that takes trivial-looking problems and discusses how hard these problems can become with larger-scale systems. We are going to start with log tailing. Let&apos;s define some concepts first.</p><h3 id="wtf-are-logs">Wtf are <code>logs</code>?</h3><p>Even though it looks like something very easy</p>]]></description><link>https://pstree.cc/problems-at-scale-log-tailing/</link><guid isPermaLink="false">5d9a25058e6a0945d6837ead</guid><category><![CDATA[tech]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Wed, 17 Mar 2021 13:25:15 GMT</pubDate><media:content url="https://pstree.cc/content/images/2019/10/logs.png" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2019/10/logs.png" alt="Simple problems at scale: log tailing"><p><strong>Simple problems at scale</strong> is a series of mini-blogs that takes trivial-looking problems and discusses how hard these problems can become with larger-scale systems. We are going to start with log tailing. 
Let&apos;s define some concepts first.</p><h3 id="wtf-are-logs">Wtf are <code>logs</code>?</h3><p>Even though it looks like something very easy to define, it&apos;s not that simple. For the sake of this article we will treat logs as a series of events stored sequentially on a storage medium. Events can be anything: network requests, sensor readings, GPS coordinates of a moving object, a series of interactions with a mobile phone, etc. <strong><em>In the very core of the definition of logs, data is immutable. It&apos;s written only once.</em></strong> In the simplest cases, logs are printed out by your console program. If you graduated college and this is a program that will run for more than 1 hour, it&apos;s a good idea to have these logs appended to a file.</p><h3 id="now-what-is-log-tailing-exactly">Now what is <code>log tailing</code> exactly?</h3><p>Given the previous definition of logs, we can deduce that <strong><em>most</em></strong> of the time we are actually interested in the most recent results of the data. And even in cases where you are interested in some data in the past, you often express that data as a function of the present. You are interested in data in the logs that are either &quot;Now&quot;, &quot;Recent&quot; or &quot;3 days ago&quot;. This all means there is much more interest in the <code>tail</code> of the log data than in any other part. The process of log tailing is to fetch the last N events/lines/transactions in a log.</p><blockquote>An angry commenter will now yawn and say &quot;Why not just use <code>$ tail</code>&quot;?</blockquote><h3 id="how-difficult-is-log-tailing">How difficult is log tailing?</h3><p>The very quick answer is: it depends. It depends on the scale of the logs, how they are arranged and how N is a factor in that.</p><p>But let&apos;s not get ahead of ourselves. 
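The single-file version of that question is tiny; here is a naive Go sketch that keeps the whole log in memory (a real `tail` seeks backward from the end of the file instead of reading all of it).

```go
package main

import (
	"fmt"
	"strings"
)

// lastN returns the last n lines of a log held in memory -- the essence of
// what `tail` answers. n is clamped to the number of available lines.
func lastN(log string, n int) []string {
	lines := strings.Split(strings.TrimRight(log, "\n"), "\n")
	if n > len(lines) {
		n = len(lines)
	}
	return lines[len(lines)-n:]
}

func main() {
	log := "req 1\nreq 2\nreq 3\nreq 4\nreq 5\n"
	fmt.Println(lastN(log, 2)) // the two most recent events
}
```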
Let&apos;s discuss how we would tail logs as they grow, starting from your console screen.</p><p>1- Logs are emitted to your console screen<br>It&apos;s simple. You wrote a cool game and you are logging changes happening in the UI. You can already see the <code>tail</code> of the logs very clearly. Duh!</p><p>2- A web server serving a few hundred active users per day<br>Unix provides a great tool for tailing logs from files. Type <code>tail /var/www/log0.txt</code> and you&apos;ll have the last 10 lines of that file printed to your console. Assuming your pet photos website got featured on a local lifestyle blog and it&apos;s peaking at 10-100s of requests per second, you&apos;d want to see that action without typing tail 10s of times. The <code>-f</code> (follow) option will continuously print newer logs to your screen. Easy, right?</p><p>3- You&apos;ve hit your first million users with around 500 QPS<br>You posted an awesome picture of a cool cat and your website got <a href="https://www.youtube.com/watch?v=QH2-TGUlwu4&amp;ref=pstree.cc">really famous</a>. Now looking at logs is a little problematic because you need a way to debug an issue happening in production. Neither <code>tail</code> nor <code>tail -f</code> can help you because of the overwhelming number of requests. Files are getting bigger and they need to be <code>rotated</code>.</p><blockquote>Log rotation: A process of isolating, archiving and compressing old logs.</blockquote><h3 id="logs-apocalypse-you-have-multi-million-user-base-with-500kqps-">Logs apocalypse: you have a multi-million user base with &gt;500kQPS.</h3><p>Here is where everything bends. Your web server is not a single ssh-able machine. Your log is not a single file. Answering the question that UNIX <code>tail</code> used to answer, &quot;What are the last N events in my log?&quot;, is not easy anymore. 
It&apos;s also time to be skeptical and ask questions like &quot;What do we really need from all those logs?&quot; and &quot;How are we going to use the data?&quot; These lazy-sounding questions are actually very proactive and lead to insights on how to tackle the scale problems that come with that amount of throughput. </p><p>Now that you have gigabytes (or petabytes, hypothetical traffic is free anyway) of log entries coming in every second, your logs are distributed across 10s or 100s of machines. You want an optimised system for writing these logs with guarantees on ordering and, most importantly, consistency. The system must be able to tail logs from hundreds of machines (while maintaining order) efficiently. <a href="https://kafka.apache.org/documentation/?ref=pstree.cc">Kafka</a> and <a href="https://engineering.fb.com/core-data/logdevice-a-distributed-data-store-for-logs/?ref=pstree.cc">LogDevice</a> are two open-source examples of distributed data stores for logs that are built with the principle of <strong><em>immutable writes and tailing</em></strong> in mind.</p><h2 id="what-s-next">What&apos;s next?</h2><p>Go through the docs for <a href="https://kafka.apache.org/documentation/?ref=pstree.cc">Kafka</a> and <a href="https://engineering.fb.com/core-data/logdevice-a-distributed-data-store-for-logs/?ref=pstree.cc">LogDevice</a> to learn about distributed logging. Spin up a local Kafka cluster. 
I find the <a href="https://github.com/spotify/docker-kafka?ref=pstree.cc">Spotify docker image </a>the fastest way to spin up Kafka locally and <a href="https://media1.popsugar-assets.com/files/thumbor/eOF2Umn-mqNGnohxrtjeurwWDmI/fit-in/2048xorig/filters:format_auto-!!-:strip_icc-!!-/2018/08/20/677/n/1922283/1118a12c5b7adb1e342de9.55515725_/i/Michael-Scott-Misquotations-Office-Video.jpg?ref=pstree.cc">play with it</a>.</p>]]></content:encoded></item><item><title><![CDATA[Wtf is rendezvous hashing?]]></title><description><![CDATA[Distributed Systems WTF Series]]></description><link>https://pstree.cc/wtf-is-rendezvous-hashing/</link><guid isPermaLink="false">5e2e317ac90e0f2d118513fb</guid><category><![CDATA[tech]]></category><category><![CDATA[wtf-series]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Mon, 08 Jun 2020 16:03:39 GMT</pubDate><media:content url="https://pstree.cc/content/images/2020/06/hash.png" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2020/06/hash.png" alt="Wtf is rendezvous hashing?"><p>This is part of a series of posts explaining cryptic tech terms in an introductory way.</p><p><strong>Disclaimer:</strong> this series is not intended to be a main learning source. However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><h2 id="motivation">Motivation</h2><p>Let&apos;s start by proposing a problem. Assume you have a system consisting of a <strong><em>serving node </em></strong>that simply takes a request from a client and routes it to the correct <strong><em>back-end node</em></strong> after altering some aspects of that request.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2020/06/story1.jpg" class="kg-image" alt="Wtf is rendezvous hashing?" 
loading="lazy"><figcaption>A single node system</figcaption></figure><p>The first thing you notice is that the serving node can be a bottleneck if not scaled proportionally to the traffic hitting it. We&apos;ll probably need a <strong><em>multi-node serving system. </em></strong>How can we route requests to a cluster of nodes?</p><blockquote>Take a moment here and think of potential solutions before scrolling.</blockquote><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2020/06/story2.jpg" class="kg-image" alt="Wtf is rendezvous hashing?" loading="lazy"><figcaption>Multi-node system</figcaption></figure><h3 id="use-all-the-load-balancers-">Use all the load balancers!</h3><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2020/06/image.png" class="kg-image" alt="Wtf is rendezvous hashing?" loading="lazy"><figcaption>Me overly excited about load-balancers</figcaption></figure><p>Well, maybe. Load balancing is a very good technique to scale up systems. But let&apos;s take a deeper look at what that load balancer would look like. Would the load balancer be a single node serving a multi-node system? It seems like load balancing might be the answer, but we&apos;ll need more than just a single load-balancing node.</p><h3 id="client-based-load-balancing">Client-based load balancing</h3><p>That&apos;s a good idea. Clients are by nature scaled up to the traffic <strong><em>(they are the traffic)</em></strong>. Let&apos;s make the client aware of all the serving nodes and let it decide which node to talk to based on some semantics of the client. Let&apos;s create a <strong><em>smart client </em></strong>that can hash a request and get a serving node.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2020/06/story3.jpg" class="kg-image" alt="Wtf is rendezvous hashing?" 
loading="lazy"><figcaption>smart-client</figcaption></figure><p>That&apos;s great. We have a client that knows which node to call directly depending on that client&apos;s identity. Assuming uniform load between all clients, this should be a good solution.</p><p>But what if the client-assigned <em>serving node</em> crashed?</p><blockquote>Take a moment here and think of potential solutions before scrolling.</blockquote><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2020/06/story4.jpg" class="kg-image" alt="Wtf is rendezvous hashing?" loading="lazy"></figure><h2 id="can-we-do-better"><a href="http://timroughgarden.org/?ref=pstree.cc">Can we do better?</a></h2><p>In traditional hashing you give a <em>one-way</em> method an input X (<strong><em>client-id</em></strong> in our example) and it returns a result H(X) (<em><strong>the serving node address</strong></em> in our example). This is a widely used concept and part of almost all programming languages in one way or another. The shortcoming of that in our example appears when H(X) points to a failing node. This is problematic because the client doesn&apos;t have any extra information to fall back on. This is where <em><strong>rendezvous hashing </strong></em>shines. 
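The trick fits in a few lines of Go. This is a minimal highest-random-weight (rendezvous) sketch, with FNV picked arbitrarily as the hash and made-up node names: every client scores each node against its own key and sorts by score.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// rank orders nodes for a key by score(node, key): the first entry is the
// node to talk to, and the rest are the fallback order if it fails.
func rank(nodes []string, key string) []string {
	score := func(node string) uint64 {
		h := fnv.New64a() // illustrative choice; any well-mixed hash works
		h.Write([]byte(node + "/" + key))
		return h.Sum64()
	}
	ranked := append([]string(nil), nodes...)
	sort.Slice(ranked, func(i, j int) bool { return score(ranked[i]) > score(ranked[j]) })
	return ranked
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	fmt.Println(rank(nodes, "client-42")) // primary first, then the fail-over order
}
```

Every client that knows the same node list computes the same ordering, so the top entry is the node to call and the remaining entries give a deterministic fail-over sequence.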
Instead of <strong><em>hashing</em></strong> a single input to a single output, <em><strong>rendezvous hashing</strong></em> takes the form <code>H(Sn, X) = Cn</code><br>where <strong><em>Sn</em></strong> is a set of elements (<strong><em>serving nodes</em></strong> in our example), <strong><em>X</em></strong> is the input for hashing (<strong><em>client-id</em></strong> in our example), and <strong><em>Cn</em></strong> is the returned ordering of <em><strong>Sn</strong></em> weighted with a complete scoring (think of it as confidence values adding up to 1).</p><blockquote>Take a moment here and think how this form can be useful in our example.</blockquote><p>Using <em><strong>rendezvous hashing </strong></em>gives us a node we should talk to (the top of the list <strong><em>Cn</em></strong>), but it also gives us an ordering to fail over to if that node failed. This is very interesting because it doesn&apos;t only solve the load-balancing fail-over issue; it also establishes what&apos;s called <strong><em>distributed k-agreement. </em></strong>Imagine a write request coming to the nodes; this write needs to be replicated across nodes in some way so that if the node that received the write request fails, the client can fail over to the next. In our scenario, if the client and serving nodes share the same H(Sn, X), nodes can decide the order of replication and create a replication chain that follows the same order in which the client would fall back. Assuming we have a discovery service that basically just keeps track of all the nodes around, all members can adapt to changes in the node structure in a distributed fashion without having to communicate with the other nodes.</p>]]></content:encoded></item><item><title><![CDATA[wtf series - what the fuzz?]]></title><description><![CDATA[<p>Fuzzing is a software black-box testing technique where units of a software system are continuously tested against a stream of random data. Units can be methods, API endpoints, database interfaces, etc. 
The software unit is then monitored for exceptions such as crashes or potential memory leaks.</p><h2 id="why-the-fuzz">Why the fuzz?</h2>]]></description><link>https://pstree.cc/wtf-series-what-the-fuzz/</link><guid isPermaLink="false">5dcdb028c90e0f2d11851315</guid><category><![CDATA[tech]]></category><category><![CDATA[wtf-series]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Sun, 26 Jan 2020 16:50:42 GMT</pubDate><media:content url="https://pstree.cc/content/images/2020/01/segv.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2020/01/segv.jpeg" alt="wtf series - what the fuzz?"><p>Fuzzing is a software black-box testing technique where units of a software system are continuously tested against a stream of random data. Units can be methods, API endpoints, database interfaces, etc. The software unit is then monitored for exceptions such as crashes or potential memory leaks.</p><h2 id="why-the-fuzz">Why the fuzz?</h2><p>The main objective of fuzzing is to get the application to crash. It&apos;s not meant to test any business-bound logic. </p><p>Fuzzers work best for discovering vulnerabilities that can be exploited by buffer overflow, DOS, cross-site scripting and SQL injection. These schemes are often used by malicious hackers intent on wreaking the greatest possible amount of havoc in the least possible time. Fuzz testing is less effective for dealing with errors that do not crash the software.</p><p>Fuzzing often reveals serious defects within the software that are usually overlooked by engineers. When you fuzz test a unit of your software, you expose it to <em><strong>&quot;unbiased&quot;</strong></em> input that is not affected by the assumptions developers usually make. 
Teams maintaining large-scale projects, where these crashes are really costly, run fuzz tests continuously, either as a qualifier for release or against the HEAD of the code under test independently of the release process, with proper reporting when a crash happens.</p><p>There are a few variations of fuzzing-oriented instrumentation techniques available, and it&apos;s usually recommended to test your software against a few of them. The most popular techniques are:</p><ul><li>Address sanitization (detects addressability issues)</li><li>Leak sanitization (detects memory leaks)</li><li>Thread sanitization (detects data races and deadlocks)</li><li>Memory sanitization (detects use of uninitialized memory)</li></ul><h2 id="how-do-fuzzers-work">How do fuzzers work?</h2><p>Before going over how fuzzers work, let&apos;s start by defining <em><strong>seed data</strong></em> as a corpus of data that represents the skeletal structure of the data. Fuzzers rely on a seed corpus to derive new inputs to test the software against. While fuzzers usually don&apos;t need a seed corpus and can run without one, good seeds help in providing a range of possible skeletal structures, enabling fuzzers to work more efficiently and achieve coverage faster, especially when the structure of the data is complicated, e.g. a complex JSON object or a deeply nested protocol buffer.</p><p>Fuzzing engines offer different interfaces depending on the language. <a href="https://github.com/google/gofuzz?ref=pstree.cc">gofuzz</a>, for example, offers an extensive interface for generating random structures. It lets developers decide how to perform the fuzzing.</p><p>For example, this snippet uses gofuzz to generate an object with randomized internal values.</p><pre><code>type MyType struct {
    A string
    B string
    C int
    D struct {
    	E float64
    }
}

f := fuzz.New()
object := MyType{}
f.fuzz(&amp;object)</code></pre><p> It lets you decide your fuzzing logic on your own. For example you can use the previous snippet to fuzz a method against 1000 random inputs.</p><pre><code>for i := 0; i &lt; 1000; i++ {
	f.Fuzz(&amp;object)
	FuncToTest(object) # test FuncToTest against 1000 random inputs
}</code></pre><p>Other fuzzers like LLVM libfuzzer for C/C++ provide a method interface that is feeded random data, software units are called inside this method with passed data to test against fuzz. The following is an example from the official <a href="https://llvm.org/docs/LibFuzzer.html?ref=pstree.cc">documentation</a> for libfuzzer. This fails if fuzzer engine passes <code>HI!</code> in <code>data</code> .</p><pre><code>extern &quot;C&quot; int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  if (size &gt; 0 &amp;&amp; data[0] == &apos;H&apos;)
    if (size &gt; 1 &amp;&amp; data[1] == &apos;I&apos;)
       if (size &gt; 2 &amp;&amp; data[2] == &apos;!&apos;)
       __builtin_trap();
  return 0;
}</code></pre><p> As per the docs, if you tried running this using: <code>clang -fsanitize=address,fuzzer file_name.cc &amp;&amp; ./a.out</code> will catch this and crash very quickly. This showcases how powerful fuzzers are in catching even this specific corner case. You can use the method <code>LLVMFuzzerTestOneInput</code> to test your code-under-test by calling the method inside it.</p><pre><code>extern &quot;C&quot; int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  FuncToTest(Data, Size);
  return 0;
}</code></pre><p>BONUS: check <a href="https://github.com/google/fuzzing/blob/master/tutorial/libFuzzerTutorial.md?ref=pstree.cc">this tutorial</a> on how to reproduce heartbleed vulnerability in OpenSSLv1.0.1 using libFuzzer</p><h2 id="resources">Resources</h2><p><a href="https://llvm.org/docs/LibFuzzer.html?ref=pstree.cc">LLVM libFuzzer</a></p><p><a href="https://github.com/google/fuzzing/blob/master/tutorial/libFuzzerTutorial.md?ref=pstree.cc">Google LibFuzzer Tutorial</a></p><p><a href="https://github.com/google/gofuzz?ref=pstree.cc">Google gofuzzer</a></p>]]></content:encoded></item><item><title><![CDATA[wtf series - wtf is Linux namespaces?]]></title><description><![CDATA[<p>Let&apos;s start by running <code>man namespaces</code> </p><p></p><blockquote>Name:<br>namespaces - overview of Linux namespaces</blockquote><blockquote>Description:<br>A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the<br>namespace that they have their own isolated instance of the global resource. &#xA0;Changes to</blockquote>]]></description><link>https://pstree.cc/wtf-series-wtf-is-linux-namespaces/</link><guid isPermaLink="false">5c2c581f8e6a0945d6837b8d</guid><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Fri, 02 Aug 2019 14:06:04 GMT</pubDate><media:content url="https://pstree.cc/content/images/2019/07/yeah-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2019/07/yeah-1.jpg" alt="wtf series - wtf is Linux namespaces?"><p>Let&apos;s start by running <code>man namespaces</code> </p><p></p><blockquote>Name:<br>namespaces - overview of Linux namespaces</blockquote><blockquote>Description:<br>A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the<br>namespace that they have their own isolated instance of the global resource. 
Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.</blockquote><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2019/04/tenor.gif" class="kg-image" alt="wtf series - wtf is Linux namespaces?" loading="lazy"><figcaption>Realizing I&apos;m overusing this one</figcaption></figure><h2 id="so-wtf-is-namespaces">So Wtf is namespaces?</h2><p>Namespaces (<em><strong>ns</strong></em> for short) are a Linux kernel feature that allows creating a logical view of system resources that&apos;s different from the physical resources a system has. This is the core idea of <strong><em>Containers </em></strong>like<strong><em> </em></strong><em>docker, rkt and LXC.</em></p><p>A simple idea of how Namespaces work can be derived backward from its applications. Let&apos;s take a docker container that runs a <strong><em>Nodejs</em></strong> server. If you do <code>docker exec -it &lt;container name&gt; /bin/bash</code> and then <code>ps aux</code>, you&apos;ll find the processes running in the container having PIDs 1, 2, 3, which usually collide with the PIDs you see when running <code>ps aux</code> on your host terminal. This is possible because of one of the Linux namespaces, the PID namespace. It isolates the process ID number space. This means two processes on the same host can have the same PID if they are in different PID namespaces.</p><p><em>This concept of resource isolation is really important in containers. Imagine running two containers on a host machine without this isolation. ContainerA could simply </em><code>kill -9 $PID</code> from ContainerB or <code>unmount</code> a disk that ContainerC depends on. BONUS: MNT namespace.</p><p>It&apos;s worth noting that namespaces don&apos;t limit resource usage; they control the visibility of resources between processes. 
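You can observe this machinery directly on a Linux machine: each process's namespace memberships are exposed as symlinks under /proc/<pid>/ns/, and two processes share a namespace exactly when those links match. A small Go sketch (it degrades gracefully on non-Linux systems):

```go
package main

import (
	"fmt"
	"os"
)

// nsLink returns the namespace identity link of the calling process, e.g.
// "pid:[4026531836]". Two processes are in the same namespace exactly when
// these links are equal.
func nsLink(name string) (string, error) {
	return os.Readlink("/proc/self/ns/" + name)
}

func main() {
	for _, ns := range []string{"pid", "mnt", "net", "uts"} {
		link, err := nsLink(ns)
		if err != nil {
			fmt.Println(ns, "unavailable:", err) // e.g. on non-Linux systems
			continue
		}
		fmt.Println(ns, "->", link)
	}
}
```

Running the same program inside a container should print different identifiers than on the host, which is the isolation at work.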
BONUS#2: <a href="https://pstree.cc/what-the-heck-are-linux-cgroups/">Wtf is cgroups?</a></p><h2 id="7-namespaces-of-ice-and-fire">7 namespaces of Ice and Fire</h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://pstree.cc/content/images/2019/07/yeah.jpg" class="kg-image" alt="wtf series - wtf is Linux namespaces?" loading="lazy"><figcaption>I feel bad explaining my word play with images.</figcaption></figure><ol><li>MNT - isolate filesystem mount points</li><li>UTS - isolate hostname and domainname</li><li>IPC - isolate interprocess communication (IPC) resources</li><li>PID - isolate the PID number space</li><li>NET - isolate network interfaces</li><li>USR - isolate UID/GID number spaces</li><li>Cgroup - isolate cgroup root directory</li></ol>]]></content:encoded></item><item><title><![CDATA[wtf series - wtf is chroot?]]></title><description><![CDATA[<p>Let&apos;s start by typing <code>man chroot</code> in a Linux terminal and see what we get</p><!--kg-card-begin: markdown--><blockquote>
<p><em>Name:</em><br>
<em>chroot - run command or interactive shell with special root directory.</em><br>
<em>Usage:</em><br>
<em><code>chroot [OPTION] NEWROOT [COMMAND [ARG]...]</code></em><br>
~ <em>Linux manual</em></p>
</blockquote>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2019/04/tenor.gif" class="kg-image" alt loading="lazy"></figure><h3 id="so-what-really-is-chroot">So what really is chroot?</h3><p>Chroot is a unix command that changes the</p>]]></description><link>https://pstree.cc/wtf-is-chroot/</link><guid isPermaLink="false">5c2c58368e6a0945d6837b91</guid><category><![CDATA[tech]]></category><category><![CDATA[wtf-series]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Sun, 28 Apr 2019 04:24:51 GMT</pubDate><media:content url="https://pstree.cc/content/images/2019/04/bandersnatch-800x445.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2019/04/bandersnatch-800x445.jpg" alt="wtf series - wtf is chroot?"><p>Let&apos;s start by typing <code>man chroot</code> in a Linux terminal and see what we get</p><!--kg-card-begin: markdown--><blockquote>
<p><em>Name:</em><br>
<em>chroot - run command or interactive shell with special root directory.</em><br>
<em>Usage:</em><br>
<em><code>chroot [OPTION] NEWROOT [COMMAND [ARG]...]</code></em><br>
~ <em>Linux manual</em></p>
</blockquote>
<!--kg-card-end: markdown--><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2019/04/tenor.gif" class="kg-image" alt="wtf series - wtf is chroot?" loading="lazy"></figure><h3 id="so-what-really-is-chroot">So what really is chroot?</h3><p>Chroot is a Unix command that changes the root directory of COMMAND to the NEWROOT specified in its parameters. A very simple use case is: <code>chroot ~/Downloads ls -la</code></p><p>Looking at the previous command and the definition of chroot, one would say that this command will:<br>1- change root to Downloads, cd to Downloads<br>2- run <code>ls -la</code> inside Downloads and print out the Downloads directory content</p><p>While this thinking process is correct, this command will actually fail for two reasons. The first reason is that chroot needs root access and will fail without root permissions. <em>If this didn&apos;t happen to you, then you probably copy-pasted a command from the internet that you don&apos;t understand and ran it with root permissions, and if I were you I would spend some time thinking about my life choices.</em><br>The second reason, after fixing the root permissions, is that the shell won&apos;t be able to execute <code>ls -la</code> in the new root directory because it won&apos;t find it under <code>/bin/ls</code>, and this demonstrates a powerful feature of chroot. Chroot does more than a simple cd into a directory. Chroot changes what <code>/</code> means for the running process. In our example, the shell will look for <code>ls</code> in ~/Downloads/bin/ instead of /bin/ because chroot changed the root to be the Downloads directory.<br>Let&apos;s try this one more time, but after making <code>ls</code> available in the new root.</p><pre><code>mkdir -p ~/Downloads/bin/
cp /bin/ls ~/Downloads/bin/


# make sure to make all dependencies available too.
# for simplicity, we make all shared libs available; for a smaller footprint, use `ldd` to get the dependencies of a specific command.
sudo cp -a /usr ~/Downloads
sudo cp -a /lib ~/Downloads
sudo cp -a /lib64 ~/Downloads
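# illustrative: `ldd` lists the shared objects a binary needs, so you
# could copy only those instead of all of /usr, /lib and /lib64, e.g.:
ldd /bin/ls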



sudo chroot ~/Downloads ls -la</code></pre><p>Now that we have made <code>ls</code> available in the new root at <code>~/Downloads/bin/ls</code>, the command will output the content of the Downloads directory.<br><br>You might have noticed that the <code>COMMAND</code> argument is optional in chroot. By default, chroot runs <code>$SHELL</code> in the new root, which will also fail if you don&apos;t have the shell executable available under <code>NEWROOT</code>.</p><h2 id="but-why">But why?</h2><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2019/04/WhyWouldYouDoThat.gif" class="kg-image" alt="wtf series - wtf is chroot?" loading="lazy"></figure><p><strong><strong>Testing and development</strong></strong>: A test environment can be set up in chroot for software that would otherwise be too risky to deploy on a production system.</p><p><strong><strong>Dependency control</strong></strong>: Software can be developed, built and tested in a chroot populated only with its expected dependencies. This can prevent some kinds of linkage skew that can result from developers building projects with different sets of program libraries installed.</p><p><strong><strong>Compatibility</strong></strong>: Legacy software or software using a different ABI must sometimes be run in a chroot because their supporting libraries or data files may otherwise clash in name or linkage with those of the host system.</p><p><strong><strong>Recovery</strong></strong>: Should a system be rendered unbootable, a chroot can be used to move back into the damaged environment after bootstrapping from an alternate root file system (such as from installation media, or a Live CD).<br><br>Worth noting: if the topic caught your interest and you search for more content, you might find people referring to chroot as <code>chroot jail</code>, and this is slightly incorrect because chroot can&apos;t be used as a security measure. 
It <em>simulates</em> a separate environment but a process that knows it&apos;s running under a chroot environment can <em>escape</em> the environment.</p><h2 id="sources">Sources</h2><p>Wikipedia<br>Linux Man<br>My terminal?</p>]]></content:encoded></item><item><title><![CDATA[wtf series - wtf is protobuf?]]></title><description><![CDATA[<p>This is part of a series of posts explaining cryptic tech terms in an introductory way.</p><p>Disclaimer: this series is not intended to be a main learning source. However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><p>Protocol buffers are</p>]]></description><link>https://pstree.cc/wtf-is-protobuf/</link><guid isPermaLink="false">5c2cd7e98e6a0945d6837be2</guid><category><![CDATA[wtf-series]]></category><category><![CDATA[tech]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Wed, 27 Feb 2019 13:19:45 GMT</pubDate><media:content url="https://pstree.cc/content/images/2019/02/protocolbuffers.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2019/02/protocolbuffers.jpg" alt="wtf series - wtf is protobuf?"><p>This is part of a series of posts explaining cryptic tech terms in an introductory way.</p><p>Disclaimer: this series is not intended to be a main learning source. However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><p>Protocol buffers are Google&apos;s language-neutral, platform-neutral, extensible mechanism for serializing structured data &#x2013; think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages. 
[<a href="https://developers.google.com/protocol-buffers/?ref=pstree.cc#what-are-protocol-buffers">src</a>]</p><h2 id="history">History</h2><p><strong>XML</strong><br>One of the oldest data serialization standards, derived from SGML, the Standard Generalized Markup Language. Standardized 1996~1998, XML was the primary structured and semi-structured data serialization standard and the basis for the SOAP protocol. It&apos;s human-readable, structured and very verbose. [<a href="https://www.w3schools.com/xml/?ref=pstree.cc">snippet src</a>]</p><pre><code>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;breakfast_menu&gt;
&lt;food&gt;
    &lt;name&gt;Belgian Waffles&lt;/name&gt;
    &lt;price&gt;$5.95&lt;/price&gt;
    &lt;description&gt;
   Two of our famous Belgian Waffles with plenty of real maple syrup
   &lt;/description&gt;
    &lt;calories&gt;650&lt;/calories&gt;
&lt;/food&gt;
&lt;food&gt;
    &lt;name&gt;Strawberry Belgian Waffles&lt;/name&gt;
    &lt;price&gt;$7.95&lt;/price&gt;
    &lt;description&gt;
    Light Belgian waffles covered with strawberries and whipped cream
    &lt;/description&gt;
    &lt;calories&gt;900&lt;/calories&gt;
&lt;/food&gt;
&lt;food&gt;
    &lt;name&gt;Berry-Berry Belgian Waffles&lt;/name&gt;
    &lt;price&gt;$8.95&lt;/price&gt;
    &lt;description&gt;
    Belgian waffles covered with assorted fresh berries and whipped cream
    &lt;/description&gt;
    &lt;calories&gt;900&lt;/calories&gt;
&lt;/food&gt;
&lt;food&gt;
    &lt;name&gt;French Toast&lt;/name&gt;
    &lt;price&gt;$4.50&lt;/price&gt;
    &lt;description&gt;
    Thick slices made from our homemade sourdough bread
    &lt;/description&gt;
    &lt;calories&gt;600&lt;/calories&gt;
&lt;/food&gt;
&lt;food&gt;
    &lt;name&gt;Homestyle Breakfast&lt;/name&gt;
    &lt;price&gt;$6.95&lt;/price&gt;
    &lt;description&gt;
    Two eggs, bacon or sausage, toast, and our ever-popular hash browns
    &lt;/description&gt;
    &lt;calories&gt;950&lt;/calories&gt;
&lt;/food&gt;
&lt;/breakfast_menu&gt;</code></pre><p><strong>JSON</strong><br>Short for <strong><em>JavaScript Object Notation</em></strong>, popularized in the early 2000s, JSON was a step forward for data representation and serialization as it was less verbose than XML, easier to support in browsers, faster to process, and simpler and more consistent in structure. This made it the main data serialization standard for the modern web for many years.</p><pre><code>// simple representation of a breakfast menu with only one item
[
  {
    &quot;name&quot;: &quot;Homestyle Breakfast&quot;,
    &quot;price&quot;: &quot;$6.95&quot;,
    &quot;description&quot;: &quot;Two eggs, bacon or sausage, toast, and our ever-popular hash browns&quot;,
    &quot;calories&quot;: 950
  }
]</code></pre><h2 id="protocol-buffers-protobuf-">Protocol buffers (protobuf)</h2><p>Google developed Protocol Buffers for use in their internal services. It is a <em><strong>binary</strong></em> encoding format that allows you to specify a <em><em><strong>schema</strong></em></em> for your data using a specification language, like so:</p><pre><code>message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}
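// On the wire each field is identified by its number, not its name,
// which is part of what keeps the encoding compact; for example,
// id = 1234 serializes to just three bytes: 0x10 0xd2 0x09
// (a one-byte tag for field 2, then the value as a two-byte varint).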

</code></pre><p>The Protocol Buffers specification is implemented in various languages: Java, C++, Go, etc. are all supported, and most modern languages have an implementation. Here is a Java example that writes a <code>Person</code> using the previous schema, followed by a C++ example that reads one back:</p><pre><code>Person john = Person.newBuilder()
    .setId(1234)
    .setName(&quot;John Doe&quot;)
    .setEmail(&quot;jdoe@example.com&quot;)
    .build();
output = new FileOutputStream(args[0]);
john.writeTo(output);</code></pre><pre><code>Person john;
fstream input(argv[1],
    ios::in | ios::binary);
john.ParseFromIstream(&amp;input);
id = john.id();
name = john.name();
email = john.email();</code></pre><p>Using protocol buffers has many advantages over plain text serializations like JSON and XML:</p><ul><li>Very dense data, which results in very small output and therefore less network overhead</li><li>A declared schema makes parsing from most languages very straightforward with less boilerplate parsing code</li><li>Very fast processing</li><li>Binary encoded and hard to decode without knowledge of the schema</li><li>Backward compatibility as a side-effect</li></ul><h2 id="references">References</h2><p><a href="https://developers.google.com/protocol-buffers/?ref=pstree.cc#what-are-protocol-buffers">Google developers - Protocol Buffers</a></p>]]></content:encoded></item><item><title><![CDATA[wtf series - wtf is pstree?]]></title><description><![CDATA[<p>This is part of a series of posts explaining cryptic tech terms in an introductory way.</p><p>Disclaimer: this series is not intended to be a main learning source. However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><p>I wrote this</p>]]></description><link>https://pstree.cc/wtf-is-pstree/</link><guid isPermaLink="false">5c4b86e48e6a0945d6837d45</guid><category><![CDATA[wtf-series]]></category><category><![CDATA[tech]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Sun, 24 Feb 2019 22:29:27 GMT</pubDate><media:content url="https://pstree.cc/content/images/2019/01/Pstree_freebsd.png" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2019/01/Pstree_freebsd.png" alt="wtf series - wtf is pstree?"><p>This is part of a series of posts explaining cryptic tech terms in an introductory way.</p><p>Disclaimer: this series is not intended to be a main learning source. 
However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><p>I wrote this mini-blog because of the many people who asked me what the name of the blog means. I was surprised because it&apos;s something I use on a daily basis and I just assumed every programmer would know it.</p><p><code>pstree</code> is a unix command-line tool that prints a tree of processes. Unlike <code>ps</code>, it prints the hierarchy of processes rather than just a flat list. If you specify a username as an argument, it trims out any process not owned by that user. The result is usually a list of subtrees of the original tree.</p><h2 id="but-what-can-it-do">But, what can it do?</h2><p>Let&apos;s start with the syntax:</p><pre><code>pstree [-a, --arguments] [-c, --compact] 
       [-h, --highlight-all, -Hpid, --highlight-pid pid] [-g, --show-pgids] 
       [-l, --long] [-n, --numeric-sort] [-p, --show-pids] [-s, --show-parents] 
       [-u, --uid-changes] [-Z, --security-context] 
       [-A, --ascii, -G, --vt100, -U, --unicode] [pid, user]</code></pre><p>If you run <code>man pstree</code> in your terminal you will find the previous snippet and a guide to all the basic functionality of <code>pstree</code>, so I&apos;ll skip all that and go to a very specific use case where you can benefit from it.</p><h3 id="-1-you-are-building-a-program-that-forks-processes-for-specific-purpose-and-kill-them-afterwards-you-want-to-debug-the-scenarios-and-see-if-the-forked-processes-life-cycle-is-handled-correctly">#1: You are building a program that forks processes for a specific purpose and kills them afterwards. You want to debug the scenarios and see if the forked processes&apos; life-cycle is handled correctly</h3><blockquote>&quot;Meh. You can always use <code>ps</code>&quot; ~ angry commenter</blockquote><p>Using <code>pstree</code> you can view the process hierarchy and check that it is torn down correctly without leaving processes behind. This is a very common pitfall when you fork a process from your main program and then kill it later. Most languages use a basic <code>SIGKILL</code>, and in many cases the killed process leaves its own children behind as orphans (and a dead child that is never reaped by its parent becomes a <code>zombie process</code>).<br><br>A <em>very simple</em> way to debug these scenarios is to run <code>pstree</code> and <code>grep</code> for the process name during your program&apos;s execution to make sure the process lifecycle is handled correctly. 
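For example, while your program is running (the process name and PIDs here are made up, and the output will look something like this):</p><pre><code>$ pstree -p $(pgrep -o myserver)
myserver(4242)─┬─worker(4243)
               └─worker(4244)</code></pre><p>If a worker line survives after your program has supposedly cleaned up, the life-cycle handling has a bug. 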
One way to ensure processes are killed correctly is to use <code>setpgid</code> to place the forked process in its own process group, then kill the whole group containing the forked process and any child processes.</p><h2 id="references"><strong>References</strong></h2><p><a href="http://man7.org/linux/man-pages/man1/pstree.1.html?ref=pstree.cc">Man pstree(1)</a></p>]]></content:encoded></item><item><title><![CDATA[wtf series - wtf is a goroutine?]]></title><description><![CDATA[<p>This is part of a series of posts explaining cryptic tech terms in an introductory way.</p><p>Disclaimer: this series is not intended to be a main learning source. However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><p>Before explaining what</p>]]></description><link>https://pstree.cc/wtf-is-goroutines/</link><guid isPermaLink="false">5c2c0aa88e6a0945d6837b3d</guid><category><![CDATA[tech]]></category><category><![CDATA[wtf-series]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Sun, 20 Jan 2019 19:03:58 GMT</pubDate><media:content url="https://pstree.cc/content/images/2019/01/gophercomplex5_0.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2019/01/gophercomplex5_0.jpg" alt="wtf series - wtf is a goroutine?"><p>This is part of a series of posts explaining cryptic tech terms in an introductory way.</p><p>Disclaimer: this series is not intended to be a main learning source. However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><p>Before explaining what goroutines are, let&apos;s define some concepts first.</p><p><strong>Concurrency: </strong>The ability for different parts of a computer program to execute out-of-order or in partial order without affecting the end result of that program. 
<strong>Concurrency is not parallelism.</strong></p><p><strong>Parallelism: </strong>The <em>simultaneous</em> execution of different parts of a computer program across a number of cores, improving overall performance. <strong>Concurrency is about <em>dealing with</em> lots of things at once. Parallelism is about <em>doing</em> lots of things at once.</strong></p><h2 id="goroutines">Goroutines</h2><p>Goroutines are a way of doing tasks <em>concurrently</em> in golang. They allow us to create and run multiple methods or functions concurrently in the same address space inexpensively. They are lightweight abstractions over threads because their creation and destruction are very cheap compared to threads, and they are scheduled over OS threads. Executing a method in the background is as easy as prepending the keyword <code>go</code> to a function call. Golang achieves parallelism by multiplexing goroutines onto multiple OS threads, so if one should block, such as while waiting for I/O, others continue to run. Their design hides many of the complexities of thread creation and management and delegates that to the go runtime scheduler.</p><p>There are a number of differences between goroutines and native threads:</p><ul><li>Memory Consumption: Goroutines don&apos;t require much space (~2kb of stack space) as they grow by allocating memory from the heap space. Threads are greedy: they need ~1MB of memory plus a guard page, which you can think of as a wall between threads&apos; memory pools.</li><li>Setup/Teardown: Greedy threads cost significantly more during setup/teardown as they request resources from the OS and have to free those after they finish executing. Because goroutines are created and terminated by the runtime, they are very cheap to create and destroy.</li><li>Context Switching: When a thread blocks, on I/O for example, another thread has to take its place. 
This operation is called context switching, and during a context switch the OS has to save the thread state: all the registers (ballpark ~40) [I don&apos;t know an exact number, don&apos;t hesitate to message me if you have a more accurate one], the program counter, stack pointer and co-processor state. Goroutines are scheduled cooperatively, and when a switch occurs only 3 registers need to be saved/restored: the program counter, stack pointer and data registers (DX). The cost is much lower.</li><li>Goroutines have channels and wait groups, primitive structures for communication. One Google search will let you know how much harder doing that with native threads is.</li></ul><h2 id="quick-hands-on-fetch-json-from-a-list-of-urls-in-goroutines-and-only-output-the-result-when-all-the-goroutines-are-finished">Quick Hands-on: Fetch json from a list of urls in goroutines and only output the result when all the goroutines are finished</h2><p><strong>// not recommended, just here to demonstrate how channels work and for the sake of completeness. </strong><br><strong>Channels version:</strong></p><pre><code>func main() {
  urls := []string{
    &quot;url1&quot;,
    &quot;url2&quot;,
    // ...
  }
  done := make(chan bool) // We don&apos;t need any data to be passed
  // buffered so senders don&apos;t block before we start reading responses
  responses := make(chan string, len(urls)) // save responses coming from urls

  for _, url := range urls {
      go func(url string) {
          responses &lt;- fetchUrl(url) // fetchUrl is assumed to be defined elsewhere
          done &lt;- true // signal that the routine has completed
      }(url)
  }

  // Since we started len(urls) routines, receive len(urls) messages.
  // This will block the main routine until all are received.
  for i := 0; i &lt; len(urls); i++ {
      &lt;-done
  }
  close(responses) // so the range below terminates

  for response := range responses { // treating the channel as a range
      fmt.Println(response)
  }
}</code></pre><p><strong>Waitgroups version: </strong></p><pre><code>func main() {
  urls := []string{
    &quot;url1&quot;,
    &quot;url2&quot;,
    // ...
  }
  responses := make(chan string) // save responses coming from urls
  var wg sync.WaitGroup
  wg.Add(len(urls)) // increment the counter of the waitgroup by len(urls)

  for _, url := range urls {
      go func(url string) {
          defer wg.Done() // defer runs wg.Done() when the goroutine returns, decrementing the waitgroup counter
          responses &lt;- fetchUrl(url) // fetchUrl is assumed to be defined elsewhere
      }(url)
  }

  // close the channel once all workers are done so the range below terminates
  go func() {
      wg.Wait() // blocks until the counter reaches zero
      close(responses)
  }()

  for response := range responses { // treating the channel as a range
      fmt.Println(response)
  }
}</code></pre><h2 id="what-s-next">What&apos;s next?</h2><p>Keywords: <br>buffered vs unbuffered channels, reusing golang waitgroups, non-blocking channel operations, channel timeouts, worker pools, ticker, mutexes, atomic counters, user space threads vs kernel threads.</p><h2 id="references">References</h2><p><a href="https://www.youtube.com/watch?v=cN_DpYBzKso&amp;ref=pstree.cc">Rob Pike - Concurrency is not parallelism </a></p><p><a href="https://golang.org/doc/effective_go.html?ref=pstree.cc#goroutines">Effective Go - Goroutines</a></p><p><a href="https://gobyexample.com/?ref=pstree.cc">Go by example</a></p>]]></content:encoded></item><item><title><![CDATA[wtf series - wtf is cgroups?]]></title><description><![CDATA[<p>This is the first of a series of posts explaining cryptic tech terms in an introductory way. </p><p>Disclaimer: this series is not intended to be a main learning source. However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><h2 id="cgroups">Cgroups</h2><p>Cgroups,</p>]]></description><link>https://pstree.cc/what-the-heck-are-linux-cgroups/</link><guid isPermaLink="false">5c29ecef8e6a0945d6837b38</guid><category><![CDATA[tech]]></category><category><![CDATA[wtf-series]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Wed, 09 Jan 2019 09:59:22 GMT</pubDate><media:content url="https://pstree.cc/content/images/2019/01/1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2019/01/1.jpg" alt="wtf series - wtf is cgroups?"><p>This is the first of a series of posts explaining cryptic tech terms in an introductory way. </p><p>Disclaimer: this series is not intended to be a main learning source. 
However, there might be follow up posts with hands-on experiments or deeper technical content for some of these topics.</p><h2 id="cgroups">Cgroups</h2><p>Cgroups, or control groups, are <strong>Linux</strong> kernel features that provide mechanisms allowing processes to be organized into hierarchical groups whose usage of various types of resources (RAM, CPU, disk I/O, network I/O) can then be limited and monitored. There are two major versions of cgroups, with some differences in group hierarchies; notably, unlike cgroupsv1, cgroupsv2 does not support attaching a process to multiple cgroups. We will be discussing only cgroupsv2 here, with a few references to v1.</p><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2019/01/cgroups.png" class="kg-image" alt="wtf series - wtf is cgroups?" loading="lazy"></figure><p>The cgroups configuration structure usually looks like this:<br>* <em>assuming it&apos;s mounted at /sys/fs/cgroup/</em></p><!--kg-card-begin: markdown--><pre><code>/sys/fs/cgroup/
            cgroup1/
                cgroup3/
            cgroup2/
                cgroup4/
</code></pre>
<!--kg-card-end: markdown--><p>This means rules/limits assigned to cgroup1 affect both cgroup1&apos;s processes and the processes of its child cgroups, cgroup3 in this example. </p><p>It should be noted that not all cgroup controllers are available in v2 yet, as it was only marked available in kernel version 4.5. &#xA0;Currently, the <code>memory</code>, <code>io</code>, <code>rdma</code>, <code>pids</code>, <code>perf_event</code> and <code>cpu</code> controllers are available.</p><h2 id="example-use-cases">Example use cases</h2><p>-	Limiting network access for a process so that it can&apos;t connect to the network. <strong><em>net_io</em></strong> <br>-	Prioritizing disk I/O from high priority processes like database servers and/or static servers. <strong><em>blkio in cgroupsv1 io in cgroupsv2</em></strong><br>-	Applying inbound/outbound network firewalls to a certain cgroup. <strong><em>net_cls in cgroupsv1 and xt_cgroup iptable filter in cgroupsv2</em></strong><br>-	Prioritizing outgoing network traffic for web crawlers and content aggregator processes. <strong><em>net_prio and xt_cgroup iptable filter in cgroupsv2</em></strong><br>-	Monitoring CPU usage of all sideloaded tools. <strong><em>cpu_acct</em></strong></p><h2 id="quick-hands-on-setting-network-i-o-priority-higher-for-mysqldb-server">Quick hands-on: Setting disk I/O priority higher for a MySQL server</h2><h3 id="setting-up-cgroups">Setting up cgroups</h3><p>Install the <code>libcgroup2</code> package on your distro. It can take different names in different distributions.</p><p>Mount cgroup2 at /cgroups:</p><pre><code>$ mount -t cgroup2 nodev /cgroups</code></pre><h3 id="creating-cgroups">Creating cgroups</h3><pre><code>$ mkdir /cgroups/highiopriority
</code></pre><p>This creates a cgroup called <code>highiopriority</code>. Now we want to add <code>io</code> to <code>cgroup.subtree_control</code> to be able to use its controller.</p><pre><code>$ echo &quot;+io&quot; &gt; /cgroups/cgroup.subtree_control
</code></pre><p>Now if we print the <code>cgroup.controllers</code> file for the <code>highiopriority</code> cgroup, we should find <code>io</code> enabled.</p><!--kg-card-begin: markdown--><pre><code>$ cat /cgroups/highiopriority/cgroup.controllers
io
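
# later, to verify which cgroup a process ended up in, read its
# /proc/&lt;pid&gt;/cgroup file; on cgroup2 it contains a single line
# like &quot;0::/highiopriority&quot;
$ cat /proc/self/cgroup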
</code></pre>
<!--kg-card-end: markdown--><p>Now we can change <code>io.weight [10,1000]</code> to give our new cgroup higher I/O priority than the default value <code>100</code> by writing the new weight to <code>/cgroups/highiopriority/io.weight</code>:</p><p><code>echo &quot;1000&quot; &gt; /cgroups/highiopriority/io.weight</code></p><h3 id="assigning-processes-to-cgroups">Assigning processes to cgroups</h3><p>Now that we have a cgroup with high I/O priority, the only missing piece is adding processes to this cgroup. To do so, we should add the pids of the processes whose I/O we want to prioritize, the MySQL server in our case, to the cgroup.procs file in the cgroup directory. </p><p><code>cat /var/lib/mysql/{yourservername}.pid &gt; /cgroups/highiopriority/cgroup.procs</code></p><p>Now MySQL&apos;s I/O is prioritized over the default cgroup. There are so many interesting things you can do with cgroups on your own workstation. <em>My recommendation is:</em> try to extend this tutorial by limiting your default browser&apos;s RAM usage, and instead of figuring out the pid this old-fashioned way, search for a way to automatically capture an executable&apos;s pid and assign it to a certain cgroup on the fly.</p><h2 id="what-s-next">What&apos;s next</h2><p>The next couple of topics I have in the queue are all Linux, Golang and some networking. I&apos;m also planning to work on a mini-docker project implementing some of the Linux concepts that will be explained in the coming topics, to build a mini-docker engine and experiment with different kernel features. 
Please feel free to post suggestions and feedback here <em><em><em><em><a href="https://curiouscat.me/essamhassan?ref=pstree.cc">https://curiouscat.me/essamhassan</a></em></em></em></em></p><h2 id="further-readings-">Further readings:</h2><p><a href="http://man7.org/linux/man-pages/man7/cgroups.7.html?ref=pstree.cc">http://man7.org/linux/man-pages/man7/cgroups.7.html</a></p><p><a href="https://www.kernel.org/doc/Documentation/cgroup-v1/?ref=pstree.cc">https://www.kernel.org/doc/Documentation/cgroup-v1/</a></p><p><a href="https://www.kernel.org/doc/Documentation/cgroup-v2.txt?ref=pstree.cc">https://www.kernel.org/doc/Documentation/cgroup-v2.txt</a></p>]]></content:encoded></item><item><title><![CDATA[A practical guide for evaluating software engineering job offers]]></title><description><![CDATA[<h3 id="you-go-with-the-biggest-pay-cheque-it-s-that-simple-right-wrong-">You go with the biggest pay cheque. It&apos;s that simple, right? Wrong. </h3><p>Don&apos;t get me wrong, your salary is very important and you should never compromise on that. However, I believe that not spending enough time on your job decisions directly affects your overall job satisfaction</p>]]></description><link>https://pstree.cc/a-practical-guide-for-evaluating-software-engineering-job-offers/</link><guid isPermaLink="false">5c1aca3f8e6a0945d6837915</guid><category><![CDATA[careers in tech]]></category><category><![CDATA[tech]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Thu, 27 Dec 2018 14:44:37 GMT</pubDate><media:content url="https://pstree.cc/content/images/2019/01/money-1.jpg" medium="image"/><content:encoded><![CDATA[<h3 id="you-go-with-the-biggest-pay-cheque-it-s-that-simple-right-wrong-">You go with the biggest pay cheque. It&apos;s that simple, right? Wrong. 
</h3><img src="https://pstree.cc/content/images/2019/01/money-1.jpg" alt="A practical guide for evaluating software engineering job offers"><p>Don&apos;t get me wrong, your salary is very important and you should never compromise on that. However, I believe that not spending enough time on your job decisions directly affects your overall job satisfaction when your job stops meeting your inflated optimistic expectations. I&apos;ll try to walk through all the details that matter to me when I&apos;m evaluating job offers. In some sections you&apos;ll find a &quot;no-no&quot; list, which is basically a list of things I believe are a hard pass <em><strong>for me</strong></em> if any company had them. Please remember that this is how <em><strong>I</strong></em> approach the topic and it&apos;s not necessarily the best approach. Before going into details, let&apos;s go through some quick FAQs around the topic.</p><p><strong>1- Do I really have to go through your boring post even though I only have one offer?</strong><br>A little offended but Ok. YES, you should. You have to know what you are walking into even if it&apos;s your only offer, if only to adjust your expectations to reality. Or even better, walk away from an offer that would harm your career and be worse than a few extra months of job hunting.<br><br><strong>2- Does this guide apply to everyone and every company? </strong><br>These points are deliberately generalized to be a fits-all model, and of course any kind of generalization has flaws. So, read with the caveat that your country, company, offer, etc. might not match 100% of the criteria and still be good enough. For example, not all companies offer equity, but that doesn&apos;t mean that companies that do not are worse. 
You have to keep in mind the market in that country, other compensation factors, your skillset, etc. and make a mindful choice.<br><br><strong>3- What are the numbers beside some sub-categories?</strong><br>These are weights that sum up to 10 per category. A category with less than 7 points is a bad indicator. One with less than 4 is <em>ehrab ya yaseen</em> (Egyptian for &quot;run for your life&quot;).</p><p><br><strong>Whether it is feedback, a question, or you feel something is missing or wrong: </strong><em><a href="https://curiouscat.me/essamhassan?ref=pstree.cc">https://curiouscat.me/essamhassan</a></em> and I&apos;ll consider adding it here.</p><h2 id="company">Company</h2><p><strong>Upper Management [3]</strong><br>You should learn about the management strategy and goals. How do their actions conform with their promises and claimed visions? Good signs are long, consistent leadership with few conflicts and a series of delivered promises. Bad signs are inflated executive salaries, unpredictable reorgs, insider trading gossip, or a record of ethical or legal misconduct. <br>	My no-no list:<br>		- Companies with a history of unpredictable layoffs of large numbers of employees. Yes, you Uber. Reference: <a href="https://bit.ly/2CuTD26?ref=pstree.cc">https://bit.ly/2CuTD26</a><br>		- Companies with irresponsible or corrupt management. Uber, again, but now Tesla too. References (just a few..): <a href="https://bit.ly/2Q0Fb5s?ref=pstree.cc">https://bit.ly/2Q0Fb5s</a> <a href="https://nyp.st/2ShFvyT?ref=pstree.cc">https://nyp.st/2ShFvyT</a> <a href="https://bit.ly/2MZqwuZ?ref=pstree.cc">https://bit.ly/2MZqwuZ</a> </p><p><strong>Performance [3]</strong><br>By performance, I mean both stock value performance and financial performance for non-IPO companies. Look at their numbers and see if they maintain consistently good performance. You don&apos;t have to be an analyst to do that. 
I like this guide as it&apos;s simple, and they don&apos;t use lots of cryptic financial terms: <a href="https://bit.ly/2T2nvbA?ref=pstree.cc">https://bit.ly/2T2nvbA</a></p><p><br><strong>Culture [4]</strong><br>This is a very hard topic to tackle, because for big companies it almost always depends on where in the company you work. And it&apos;s no easier with startups. Because startups mature at a much faster rate than big companies, startups&apos; cultures can change in months, even weeks, from a heavenly workplace to a terrible place to work, or vice versa. You have to be able to find your type. Not all people like relaxed cultures and relaxed goals. Some people want something fast and high velocity. Others want something formal and strict (which is not necessarily a bad thing). So the first step in this point is to find what your type is. If you look at the reviews on Glassdoor you&apos;d be surprised by the people who left your dream companies because they weren&apos;t &apos;serious enough&apos;, weren&apos;t working on challenging enough projects, or simply because people didn&apos;t get any work done.<br><br>	My no-no list:<br>		- Companies with a toxic blame culture, where you&apos;ll be afraid to make a mistake in your code and this fear will end up crippling you and make it harder for you to make impact and grow in the company.<br>		- Companies with no well-defined cultures. Those that let mid-level managers drive the culture of the company and not the opposite. It takes one bad manager for the company to sink into toxicity, losing most of its talent and gaining a very bad reputation for it. Most engineers with a couple years of experience will be able to relate to this specific type.</p><h2 id="career">Career</h2><p><strong>Role [3]</strong><br>Is it really what you want to do? Look at the description and communicate efficiently with your prospective manager. Don&apos;t &quot;figure it out on the spot&quot;. 
Recruiters usually <em>sell</em> you on the idea that you can move teams if you don&apos;t like your first team. This is true, but not the whole truth. Yes, you can, but doing so very early is a seriously bad sign on your profile. The outcome can range from bad feedback stuck to your profile with that company forever to an immediate firing decision.</p><p><strong>Manager [4]</strong><br>YES, your next manager should be one of the most important factors in your decision-making process, not only in this category but overall. Your first manager will help you navigate your future in the company, prep you for your role, and communicate their feedback clearly with you. It&apos;s very tricky to get insights about your manager, but you can do so either by asking people on that team, if possible, or by having calls with the manager and asking questions that reveal as much as possible about their management style. <a href="https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-steps-to-better-code/?ref=pstree.cc">The Joel Test</a> is a very insightful list of questions to start that conversation with. <a href="https://blog.codinghorror.com/the-programmers-bill-of-rights/?ref=pstree.cc">The programmer&apos;s bill of rights</a> is less appropriate for a manager call but very good for your online research of the company.<br><br>	My no-no list:<br>		- Passive-aggressive managers.<br>		- Technically incompetent managers.<br>		- &#xA0;Unethical managers.<br>		- &#xA0;Managers who have a conflict of interest with your career growth.*** This is very important. Say you are going to work for a company and the manager says something like &quot;we need someone to maintain our legacy system, because it&apos;s in a bad shape, and we have been searching for someone to maintain it for a year now&quot;. That phrase means a couple of things. First, it&apos;s a legacy system: there will be no updates or feature changes to it. This means no challenges and therefore stalled promos. 
Also, it&apos;s against the manager&apos;s interest for you to switch to something else. And it&apos;s an extra negative sign if the legacy system is built on a popular stack like Java: it basically means no one wanted the job.</p><p><strong>Growth [3]</strong><br>It&apos;s really hard to measure your potential growth in a company before joining, but you can still rely on a couple of factors. One thing to consider: most good companies have well-defined criteria for promotions in the form of a job-ladder guide, making it highly unlikely for you to be denied a promotion if you are meeting the publicly shared bar. Ask your recruiter/manager about your potential for promotions.</p><p>But again, promotions are not the only kind of growth you get in a company. Skills growth is a very important factor. Learn how the company invests in you and your skills.<br><br>	My no-no list:<br>		- Companies with vague, unclear, selective promotions.<br>		- Companies with low demand at upper levels. It&apos;s usually much harder to get promoted, or even to get the chance to work on challenging, exciting projects, if they have an influx of senior engineers. And even though you&apos;ll be surrounded by senior engineers you can learn from, I believe the cons outweigh the pros.</p><h2 id="contract">Contract</h2><p><em>Beware: too many underlined words ahead, for very valid reasons.</em></p><p>There are a couple of points you should check while reading any offer letter. Aside from the basics like the title, role, salary, bonus, etc., you should read the clauses about working hours, holidays and sick leave, causes of termination, start/end dates, and the notice period.</p><p><em>The most important thing in this section is to check for any restrictive clauses: things like non-compete, non-solicitation, non-dealing, non-poaching, no-second-job, etc. 
These can have serious implications for your future, and you should make sure you are okay with whatever is in your contract; otherwise, go over it with your recruiter.</em></p><p>My no-no list:<br><em>		- Very restrictive clauses like a long non-compete, especially if it might conflict with my future interests.</em></p><h2 id="compensation">Compensation</h2><p><em>Sub-categories in italic are optional*</em></p><p><strong>Base: cash you get in the bank every month, <em>usually</em> presented as a yearly figure.</strong><br>Base is always within a fixed range per level. You should research the average compensation for your company and country and, most importantly, for your skills at that specific level. Compare your research with your offer and you&apos;ll end up with a clear idea of whether you are being paid enough. Another important tip if you are moving to a new country/state: most countries tax people differently based on <strong><em>many factors.</em></strong> Make sure you learn your net salary and how it compares to the cost of living in your new city.</p><p>My no-no list:<br>		- Offers with very low bases in exchange for &quot;a chance to learn&quot;. This usually happens with startups with tight budgets and high potential. The rule is: if you are going to do the work, you have to be paid the money.</p><p><strong><em>Stocks: company stock grants to employees, usually distributed over four years.</em></strong><br>These are optional and most companies don&apos;t offer them, <em><strong>BUT </strong></em>you should expect your offer to include stocks/stock options if the company is an early- to mid-stage startup. Stocks are also a standard part of job offers from companies like Amazon, Google, Facebook and Apple. They are a way of retaining talent, as you <em>usually</em> receive them over four years. The vesting schedule, which is the schedule on which you receive your shares, is usually a standard 25/25/25/25. 
However, it still differs from company to company. Amazon, for example, offers a 5/15/20/40 schedule. Schedules have more details than that, so be aware of the plan you are on and make sure you are okay with it. Also, unlike base, stocks are usually a very good negotiation point, and you can always try to push for 20-30% more if you think you deserve it.</p><p><strong><em>Signing Bonus: one-time bonus you get when you join the company.</em></strong><br>An optional signing bonus is usually given to candidates either as a standard offering or to make the offer more competitive compared to other offers. A good strategy for evaluating the signing bonus is to divide it over 4 years and add it to your total compensation. We&apos;ll talk more about that at the end of this section.</p><p><strong><em>Yearly Bonus: % of your base salary given out yearly on a prorated basis.</em></strong><br>Bonuses are pretty standard. Worth noting that most companies have a performance multiplier, which means that if you are initially offered a 15% bonus, your final bonus can land anywhere between 1 x 15% and 2 x 15%. Of course these numbers are illustrative and will differ depending on the company. Feel free to ask your recruiter about them.</p><p><strong>Yearly Raise: % increase of your base salary, usually based on performance.</strong><br>This is pretty standard as well, but usually not spelled out in your package. You should ask about the raise and make sure it&apos;s within acceptable ranges. I&apos;ve encountered companies that offer a very low yearly raise, so unless you get promoted, your salary won&apos;t really change. Make sure you are okay with your yearly raise.</p><p><em><strong>Refreshers:</strong></em><br>Refreshers usually means additional stock grants given out after your initial stocks. Not all companies have refreshers, and those that do have different schedules and reasons for giving them. 
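</p><p>To make the arithmetic above concrete (base, plus the expected yearly bonus, plus that year&apos;s vesting tranche, plus the signing bonus spread over your stay), here is a rough sketch in Python. The function name, every figure, and the back-loaded vesting schedule are all hypothetical, made up for illustration; plug in your own offer&apos;s numbers:</p>

```python
# A rough, illustrative per-year compensation sketch. Every number here is
# hypothetical, not any company's real figures.

def yearly_comp(base, stock_grant, vesting, signing_bonus, bonus_pct, years=4):
    """Per-year compensation over the vesting period."""
    assert abs(sum(vesting) - 1.0) < 1e-9, "vesting fractions must sum to 1"
    return [
        base                        # cash salary
        + base * bonus_pct          # expected yearly bonus, before any multiplier
        + stock_grant * vesting[y]  # stock tranche vesting that year
        + signing_bonus / years     # signing bonus amortised over the stay
        for y in range(years)
    ]

# Standard 25/25/25/25 vs a hypothetical back-loaded 10/20/30/40 schedule:
even   = yearly_comp(100_000, 80_000, [0.25, 0.25, 0.25, 0.25], 20_000, 0.15)
skewed = yearly_comp(100_000, 80_000, [0.10, 0.20, 0.30, 0.40], 20_000, 0.15)
print(even)    # flat: the same figure every year
print(skewed)  # back-loaded: year 1 pays less, year 4 pays more
```

<p>Both schedules deliver the same 4-year total; the back-loaded one just pays noticeably less in year 1, which matters a lot if you don&apos;t expect to stay the full four years.</p><p>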
Find out whether your potential employer offers any kind of stock refreshers and when those refreshers are given.</p><p><strong>Benefits: </strong><br>Yes, benefits are not just perks; they translate directly into compensation. A company that offers free food adds the money you don&apos;t spend on food to your base. A company covering your transportation ticket adds that amount to your base as well. Make sure this input is very clear to you when you are making a decision. Also, be critical about these benefits: a company can list 20+ benefits on its site that all end up being a very small surplus to your income, in which case you&apos;d have been better off going with the company with the higher base, fulfilling the Tom Cruise dream.</p><p><strong>Total compensation: </strong>It gets really overwhelming boiling all of this down into one number, so I have a formula that can help you do it. And since this is specifically for software engineers:</p><figure class="kg-card kg-image-card"><img src="https://pstree.cc/content/images/2018/12/download.jpeg" class="kg-image" alt="A practical guide for evaluating software engineering job offers" loading="lazy"></figure><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://pstree.cc/content/images/2018/12/render--1-.png" class="kg-image" alt="A practical guide for evaluating software engineering job offers" loading="lazy"><figcaption>* expected years at that company, ** costIdx is cost index of the country you are moving to [optional], *** yearly.bonus.percent not raise</figcaption></figure><p>Keep in mind that this is a made-up formula put together by a very sleepy person late at night, and it&apos;s a little different from the well-known total-compensation calculations on the internet, so take it with a grain of salt.</p><h2 id="at-the-end"><strong>At the end</strong></h2><p>Take your time and consider everything in this post, but most importantly consider what matters to you before making up your 
mind. Good luck!</p><p></p><!--kg-card-begin: html--><iframe src="https://docs.google.com/forms/d/e/1FAIpQLScGv-b6tRTZZLIdDbQdGaCxIBW1F_j8X-TAgAlQKiHfXJe88Q/viewform?embedded=true" width="640" height="576" frameborder="0" marginheight="0" marginwidth="0">Loading...</iframe><!--kg-card-end: html-->]]></content:encoded></item><item><title><![CDATA[15 pieces of advice I wish I had been given before graduating computer science]]></title><description><![CDATA[<p><strong><em>I&#x2019;m not sure if I&#x2019;m even qualified to give advice, but let&#x2019;s say these are based on my top mistakes</em></strong></p><p>1- Be open to learning everything, even the tools and technologies you don&#x2019;t like.</p><p>2- The best way to learn a new thing is</p>]]></description><link>https://pstree.cc/pieces-of-advice-i-wish-i-had-been-given-before-graduating-computer-science/</link><guid isPermaLink="false">5c1ac2738e6a0945d6837908</guid><category><![CDATA[tech]]></category><category><![CDATA[life]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Wed, 19 Dec 2018 22:17:21 GMT</pubDate><media:content url="https://pstree.cc/content/images/2018/12/cvr.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://pstree.cc/content/images/2018/12/cvr.jpg" alt="15 pieces of advice I wish I had been given before graduating computer science"><p><strong><em>I&#x2019;m not sure if I&#x2019;m even qualified to give advice, but let&#x2019;s say these are based on my top mistakes</em></strong></p><p>1- Be open to learning everything, even the tools and technologies you don&#x2019;t like.</p><p>2- The best way to learn a new thing is to use it.</p><p>3- Learn as many languages as you want, BUT always have one language you excel at. And by excel at, I mean a language where your googling-to-typing ratio is 1 search per 100 LOC or less.</p><p>4- Read books and fat references. 
Get the concrete big picture, not only the &#x201C;Up and Running&#x201D; version.</p><p>5- Invest a lot of time planning for your first job.</p><p>6- Ask for feedback frequently and act on it.</p><p>7- Learn about the market&#x2019;s needs.</p><p>8- Take good care of your resume. Develop it iteratively and take industry feedback whenever you can.</p><p>9- Algorithms and data structures are a 50% boost to your career, and sometimes more. So practice, practice, practice.</p><p>10- Work-life balance is not a choice. By overworking you are not necessarily overachieving.</p><p>11- Avoid imposters and beware of falling into the Dunning-Kruger effect. Always remember: you are not there yet.</p><p>12- Ask others how they achieved what they achieved. Get insights from your peers and ask for feedback.</p><p>13- Networking and maintaining good work relations are not luxuries. A career is not only about being a good coder.</p><p>14- Maintain a Trello board (or basically any other board) of your plans and achievements. Monitor your performance.</p><p>15- Eat healthy and work out. Your health is important not only for your productivity but also for your overall happiness. Nobody wants to spend the day at the hospital.</p>]]></content:encoded></item><item><title><![CDATA[The unusual guide for cracking interviews at Google, Facebook, Amazon, Cloudflare, Bloomberg and others]]></title><description><![CDATA[<h2 id="introduction">Introduction</h2><p>This guide is intended for people pursuing software engineering careers in high-end tech companies. I will be discussing my encounters with these companies in a number of verticals: <em>Resumes</em>, <em>Preparations</em> and <em>Interviews. 
</em><strong>I&apos;ll try to make this post as beneficial as possible and not yet another &quot;</strong></p>]]></description><link>https://pstree.cc/the-unusual-way-for-cracking-interviews/</link><guid isPermaLink="false">5c10b1d86417050c48f01263</guid><category><![CDATA[careers in tech]]></category><category><![CDATA[tech]]></category><dc:creator><![CDATA[Essam Hassan]]></dc:creator><pubDate>Sun, 16 Dec 2018 11:42:42 GMT</pubDate><media:content url="https://pstree.cc/content/images/2018/12/cvr1.1-1.jpg" medium="image"/><content:encoded><![CDATA[<h2 id="introduction">Introduction</h2><img src="https://pstree.cc/content/images/2018/12/cvr1.1-1.jpg" alt="The unusual guide for cracking interviews at Google, Facebook, Amazon, Cloudflare, Bloomberg and others"><p>This guide is intended for people pursuing software engineering careers in high-end tech companies. I will be discussing my encounters with these companies in a number of verticals: <em>Resumes</em>, <em>Preparations</em> and <em>Interviews. </em><strong>I&apos;ll try to make this post as beneficial as possible and not yet another &quot;binge on problems&quot; guide. Feel free to ask/message me about anything, whether it&apos;s feedback, a question, or something you feel is missing or wrong: </strong><em><a href="https://curiouscat.me/essamhassan?ref=pstree.cc">https://curiouscat.me/essamhassan</a></em></p><h2 id="resumes">Resumes</h2><p>Many people believe that passing the interview is the hardest part. But if you look at the numbers, passing the screening phase is much more problematic, as many factors that are irrelevant to your experience and technical expertise contribute to your odds: referrals, resume format, the average level of other candidates at the time of screening, hiring season, etc.</p><p>In this section I&apos;d like to divide these companies into two categories, even though there will still be variance between companies within the same category. 
The first category is companies hiring for specific roles in specific teams; they look for highly specialised people. In my case, this was <strong>Cloudflare</strong>, one of my favourite interview processes. The title was Distributed Systems Engineer - Data Pipelines Infrastructure, and the whole process was specific to this title and the team I would potentially be joining. More on that in the next section.<br>		<em>How to pass their screening? </em>Customise your resume to show off the experiences most relevant to what they need. In my case, they wanted someone with Kafka and Elasticsearch knowledge, and I happened to have worked with both at a relatively advanced level @Instabug, so I mentioned that in my resume. Add the most relevant experiences, but also remember not to exaggerate or &apos;oversell&apos; yourself, because you&apos;ll be expected to live up to your resume in the interviews.</p><p>The second category is companies that hire pre-allocation: you get hired first, then get matched with a team. This is usually the case with Facebook, Google, Amazon and Bloomberg. At Google and Amazon, you get matched after passing all the interviews. Facebook and Bloomberg go even further and do the matching after your orientation/training. <br>		<em>How to pass their screening?</em> In this category, hiring is usually focused on high-level technical skills. They are looking for someone who 1- has had technical impact on their team, and 2- demonstrates problem-solving skills at an abstract level. This is because <strong><em>most</em></strong> of the time they don&apos;t know where you&apos;ll end up working, and they want you to have skills that are as generic as possible. Mentioning that you are a Laravel expert won&apos;t be very attractive. On the other hand, mentioning that you wrote multiple services in multiple languages will definitely help. 
This shows you are language-agnostic and that you have, to a certain degree, good design skills. These companies also rely heavily on referrals. Try to get a referral from your network, and don&apos;t be shy about reaching out to old colleagues or 2nd-degree friends. If they think you are good, it&apos;s in their interest to refer you, as they get a referral bonus for every candidate they refer who passes.</p><h2 id="preparation">Preparation</h2><p>The same categories that shape resumes also shape your preparation afterward. When a Google recruiter thinks your resume looks good and demonstrates generic problem-solving and design skills, your interview performance should reflect those skills. So for these companies (Amazon, Bloomberg, Google, Facebook) I&apos;d recommend working on those two aspects.</p><p><em>1- Generic problem-solving skills</em><br>	You are expected to tackle medium-to-hard programming problems very fast in your preferred language, and to make sure your solution is correct by testing it. You are usually put through this test multiple times under multiple conditions, i.e. IDE, phone, Chromebook, whiteboard. My recommendations are: <br>			- Practice tackling problems you are not familiar with, not just memorising solutions and pattern matching. Before my Google interview, which was my last, I had solved problems from all the popular books and was done with LeetCode, and yet all the problems were new to me. So I wouldn&apos;t advise memorising problems; you should be able to tackle problems you&apos;ve never seen before.<br>			- Learn your language very well. Some interviewers might tolerate gaps in language knowledge, but you can&apos;t bet your luck on that.<br>			- Book recommendations: Cracking the Coding Interview is of course one of the good books to prepare from, but the problems in it are relatively easier than Google and Facebook level, so make sure you don&apos;t limit yourself to that difficulty. 
Also don&apos;t miss out on the soft-skill advice on how to handle interview situations. Elements of Programming Interviews is also a very nice book with richer technical content, but it focuses only on technical questions.<br>			- These books mention baseline skills that every candidate should be familiar with. Be <em>really </em>comfortable implementing and using all of those data structures and algorithms. Many problems are a twist on a well-known algorithm or data structure, and interviewers usually expect you not to struggle with implementing the basic idea.</p><p><br><em>2- Design skills</em><br>	These skills are usually tested by two types of questions. In the first, you&apos;re asked to design a standalone system with certain requirements; expect follow-up questions that change the requirements, and design the system to be flexible to changing requirements. The other type comes as a follow-up to a problem-solving question. Expect questions like &quot;How would you scale this solution for 1000x the input?&quot; or &quot;How would you shard this?&quot;. You should be familiar with concepts like sharding, load balancing, inverted indices, replication, etc. If you have industry experience with large-scale systems, you shouldn&apos;t have an issue with these questions; if not, your best bet is reading Grokking the System Design Interview and other design books to understand distributed-systems basics.</p><p><em>3- Role-specific skills</em><br>	For companies hiring for role-specific skills (in my case, Cloudflare for Data Pipelines Infra and HelloFresh for Golang), your preparation should be derived from 1- the job description and 2- the recruiter call. Ask as many questions as you can about the role&apos;s requirements and what they do. In my case, Cloudflare was looking for someone with advanced Unix experience, a deep understanding of Kafka, and knowledge of Elasticsearch internals. 
This is the information I asked about in the recruiter call. You should always ask about what they do and what problems they have; this reflects what they are looking for in an engineer. At Cloudflare I asked how they use Elasticsearch, and it turned out they don&apos;t: they wanted someone familiar with the internals of Elasticsearch because they have similar infrastructure. </p><p>With HelloFresh it was different: they were interested in someone with good Golang experience. I wouldn&apos;t have known that, as their interview offered JavaScript, PHP and Go as options even though they were looking specifically for Go experts. Asking questions in the early phases will tell you exactly what they are looking for. Keep in mind that recruiters are your advocates: it&apos;s in their interest for you to succeed and pass the interviews, and they can help you better understand the requirements.<br></p><h2 id="interviews">Interviews</h2><p><em>1- Amazon</em><br>I interviewed with Amazon twice. The first time, I passed the phone interview but they then delayed the process due to headcount, so I had to go through the process again with another phone interview and 5 rounds of onsite interviews. Both phone interviews and 4 out of 5 of the onsite interviews followed the same format: the interviewer starts by giving you a vague problem to solve, followed by a 20-minute behavioural interview focused on Amazon&apos;s leadership principles: <a href="https://www.amazon.jobs/en/principles?ref=pstree.cc">https://www.amazon.jobs/en/principles</a><br><br>One interview was slightly different: it focused on object-oriented design (not distributed systems), and the requirements got incrementally tougher as we went through the design. It&apos;s good to start with a design you can extend later and to leave room for flexibility in your designs.<br><br>	How to prepare? Problem solving, Amazon leadership principles, object-oriented design.<br>	What they focus on? 
Leadership skills, ability to communicate solutions clearly.</p><p><em>2- Bloomberg</em><br>Bloomberg had one of the longest interview processes I&apos;ve ever been through. All technical interviews followed the same format: 1-2 problem-solving questions. The exception was my first interviewer, who asked a little about my background, but I felt it was a warm-up question not really contributing to the process. If you find a problem too easy, finish your solution quickly, because there will probably be another one.<br>	How to prepare? Problem solving.<br>	What they focus on? -</p><p><em>3- Cloudflare</em><br>Cloudflare&apos;s interviews were my favourite. The first round was a HackerRank screening task followed by a screening phone call. The phone call was an interesting conversation about scalability and <em>&quot;all things role-related&quot;</em>: a mix between introducing me to the role and figuring out whether I could hold a good conversation about their infrastructure. The engineer described an infrastructure problem they had and asked if I had any ideas about how it could be solved. It was not a question-answer format but rather a water-cooler conversation to get a sense of how deep I could go into certain topics. The second set of interviews proposed problems of a high-scale nature. I can&apos;t disclose the questions, but they were very different from all the other companies&apos;: they combined problem solving with extreme code optimisation, and I was allowed to google things and read documentation.<br>	How to prepare? Read the role description carefully, know what skills they expect, and get as much information as possible in the first call. Refresh your memory on the basics of every skill mentioned and make sure you have a good command of the language you choose.<br>	What they focus on? High-volume network/data, scalability, and distributed systems.</p><p><em>4- Google</em><br>I had the regular Google process that is well known on the internet, with a slightly different experience. 
It started with a recruiter screening call that was <em>&apos;not an interview&apos;</em>, yet I ended up being asked technical questions about red-black trees and Golang internals. Then I had a phone interview with one moderate problem. After a week, I went through a loop of interviews with problems varying in difficulty. Some of the interviews included follow-up systems questions in the format of &quot;How would you scale this?&quot; and &quot;How would you shard that?&quot;. Google&apos;s interview questions were relatively harder than the rest of the interviews.<br>	How to prepare? Expect unfamiliar questions. All my questions were ones I had never seen before, which was surprising given that I had solved a relatively large number of problems. Don&apos;t rely on memorising solutions or even pattern matching; make sure you practise solving new problems. Also, the questions had twists in their descriptions, so pay attention and ask clarifying questions about anything you think is unclear.<br>	What they focus on? The main focus was problem solving, with interest in scale and low-level knowledge.</p><p><em>5- Facebook</em><br>My Facebook process was cut short: because of headcount issues, I had to pause it. The interviews were similar to Google&apos;s process, minus the first screening call. The problems were clearer and a little more familiar. I believe they look for the same type of engineers as Google, and preparing for these interviews should be the same.</p><h2 id="conclusion">Conclusion</h2><blockquote>If there&#x2019;s one piece of advice I could give to people, it would be to take action. I think this is what separates those who get the offers they really want from those who don&#x2019;t: taking the steps and actually preparing for and applying to the companies they want to get into. Many people think it&#x2019;s impossible for them, or that they are not good enough, but this is not true. 
If you work hard, prepare, apply, and repeat, you are eventually going to get the offers you dream of. - <a href="http://fb.com/mshokry10?ref=pstree.cc">Mahmoud Shokry</a></blockquote><p>I do believe that most readers who think they are not good enough for these companies are just overwhelmed by the stigma created by the media and by engineers benefiting from the exclusivity. Take action and know that hard work pays off.</p><h2 id="resources">Resources</h2><p><a href="https://www.amazon.com/Cracking-Coding-Interview-Programming-Questions/dp/0984782850?ref=pstree.cc">Cracking the Coding Interview, 6th Edition</a><br><a href="https://leetcode.com/?ref=pstree.cc">LeetCode.com</a><br><a href="https://www.interviewbit.com/?ref=pstree.cc">InterviewBit.com</a><br><a href="https://www.pramp.com/?ref=pstree.cc">Pramp.com</a><br><a href="https://leetcode.com/mockinterview/?ref=pstree.cc">LeetCode Mock Interviews</a></p><p></p>]]></content:encoded></item></channel></rss>