Reliability Engineer
390 jobs found
Company | Location | Salary |
---|---|---|
Shardeum Foundation | Remote | $90k - $112k |
Launchpadtechnologiesinc | Remote | $103k - $117k |
IO Global | Remote | $126k - $132k |
Impossible Cloud | Hamburg, Germany | $103k - $117k |
ZetaChain | Remote | |
xLabs | Buenos Aires, Argentina | $72k - $100k |
Ledger | Paris, France | $94k - $148k |
Launchpadtechnologiesinc | Remote | $185k |
Fireblocks | | $98k - $150k |
Ledger | Paris, France | $120k - $156k |
Gemini | Remote | $172k - $215k |
Nethermind | Remote | $112k - $156k |
Fmr | Bangalore, India | $105k - $120k |
Coinbase | Remote | $211k - $249k |
Alchemy | Bucharest, Romania | $80k - $85k |
About The Role:
The Engineering team at Shardeum is responsible for delivering the Shardeum Mainnet, and developing the smart contract platform, the consensus layer and the protocol layer. We focus on building scalable, performant, secure and reliable software that can be downloaded by thousands of node operators to actualize the Shardeum network.
We are in search of a highly talented, innovative Senior Site Reliability Engineer to join our team. If you enjoy solving complex computer science problems, are passionate about what you work on, are a perfectionist who wants to build things the right way, and are persistent about finishing what you started, then you're the kind of person we are looking for. You will be working with equally talented and innovative individuals with the goal of building incredible software with the potential to change the world.
What You'll Be Doing:
- Enhance observability, reliability, and availability by defining and tracking key performance metrics
- Tackle complex technical challenges, improving systems and products to make them more reliable, easier to deploy, and simpler to operate and diagnose
- Automate routine tasks and processes to enhance efficiency
- Collaborate with the infrastructure team to fine-tune and optimize deployments
- Take ownership of security, scalability, operational integrity, and architectural clarity in the design and implementation of systems
- Develop and implement disaster recovery and business continuity strategies for blockchain infrastructure
- Design and manage network topologies to ensure optimal performance, security, and scalability
- Create and enforce best practices, standards, and policies for blockchain infrastructure management
What We're Looking For:
- Strong experience in Site Reliability Engineering (SRE)
- Proficiency in programming languages such as Python, Scala, Go, JavaScript (including Node.js), or TypeScript
- Exceptional troubleshooting abilities, with a keen eye for identifying and resolving potential issues proactively
- Experience with monitoring and alerting tools
- In-depth experience managing Linux-based infrastructures and performing Linux/Unix system administration
- Strong understanding of networking fundamentals and protocols
- Experience with distributed systems and software architecture
- Familiarity with infrastructure management principles and tools
- Outstanding communication skills, capable of explaining complex technical concepts to both team members and management
- Bachelor’s degree in Computer Science, Engineering, or a related field
We'd Love If You Have:
- Experience with blockchain protocols (Ethereum, Bitcoin, Cosmos, etc.) and consensus mechanisms
- A desire to contribute to the growth of our blockchain ecosystem
- Experience contributing to open source projects
- Written articles or created videos on technical topics, especially related to blockchain
- Read the Bitcoin and Ethereum whitepapers
What does a Reliability Engineer do?
A Reliability Engineer is responsible for ensuring the reliability and availability of an organization's systems and equipment.
They use engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.
Here are some of the typical tasks and responsibilities of a Reliability Engineer:
- Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
- Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
- Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
- Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
- Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
- Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
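As a rough illustration of the statistical side of the role described above, the following Python sketch computes three common reliability figures from maintenance data: mean time between failures (MTBF), the probability of surviving a given period, and steady-state availability. It assumes a constant failure rate (an exponential model), which is a simplification, and all function names and input numbers are hypothetical, chosen only for the example.

```python
import math


def mtbf(total_operating_hours: float, failure_count: int) -> float:
    """Mean time between failures: operating time divided by failures seen."""
    if failure_count <= 0:
        raise ValueError("failure_count must be positive")
    return total_operating_hours / failure_count


def reliability(t_hours: float, mtbf_hours: float) -> float:
    """Probability of running t_hours without failure.

    Assumes a constant failure rate (exponential model): R(t) = exp(-t / MTBF).
    """
    return math.exp(-t_hours / mtbf_hours)


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: 10,000 operating hours, 4 failures, 5 hours mean repair time.
m = mtbf(10_000, 4)
print(f"MTBF: {m:.0f} h")                                    # MTBF: 2500 h
print(f"R(500 h): {reliability(500, m):.3f}")                # R(500 h): 0.819
print(f"Availability: {availability(m, 5.0):.3f}")           # Availability: 0.998
```

In practice, failure rates are often not constant over an asset's life, so a model such as Weibull may fit observed data better; the exponential form is used here only because it keeps the arithmetic easy to follow.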