Amazon Aurora: A production Horror story

We moved to Amazon Aurora to save costs, lower replication time, and improve database performance. We planned to shut down read-only slaves during the evening and add them back during the day.

Before Aurora “Cluster”

With a vanilla MySQL master-slave setup, the code had to know which servers were read-only slaves. Historically, we would specify a list of read-only servers:

 ['hostnames'][] => 'aurora-rds.slave1.us-east-1.rds.amazonaws.com',
 ['hostnames'][] => 'aurora-rds.slave2.us-east-1.rds.amazonaws.com',
 ['hostnames'][] => 'aurora-rds.slave3.us-east-1.rds.amazonaws.com',
 ['hostnames'][] => 'aurora-rds.slave4.us-east-1.rds.amazonaws.com',
...

To then create a connection to a reader we would pick a read instance at random.

// Pick a random read host and connect to it
$rand = rand(0, count($hostnames['hostnames']) - 1);
$conn = new mysqli($hostnames['hostnames'][$rand], $username, $password);

Problem 1: nothing expands and contracts

The servers specified for reading do not expand and contract. If load dramatically increases, you need to add new servers through the RDS console and then add the host strings to your $hostnames. Because of the size of our database, adding servers would typically take us 2-4 hours.

On the other hand, when load dropped, scaling down required us to carefully drop machines and push code to remove the connection strings.

Solution, Amazon Aurora: Read only cluster

We were excited to see that Amazon Aurora offered a read-only connection string. Instead of specifying each read database's connection string:

 ['hostnames'][] => 'aurora-rds.slave1.us-east-1.rds.amazonaws.com',
 ['hostnames'][] => 'aurora-rds.slave2.us-east-1.rds.amazonaws.com',
 ['hostnames'][] => 'aurora-rds.slave3.us-east-1.rds.amazonaws.com',
 ['hostnames'][] => 'aurora-rds.slave4.us-east-1.rds.amazonaws.com',
...

We could now use just one read-only connection string.

Amazon Aurora Clusters Production

As you add Aurora slaves, they are added to the pool of machines for that one read only connection string.  You can click for details to see your one connection string.

Under cluster details you will get a connection string. Sweet, we now have it:

 ['hostnames'][] => 'aurora-rds.cluster.read-1.rds.amazonaws.com'

Amazon Aurora: Let’s start dropping databases

Well, it was late at night and load was light. Let’s see what happens if we nuke a reader. What could go wrong? We deleted a machine.

With the machine deleting, we thought that since it was a ‘cluster’, Amazon would just redirect the reads to another machine. We continued to high-five.

Shit just got real.

OK. A PagerDuty page comes in. Sweating. A second page comes in. Apparently, Amazon continues to send read requests to a bad slave.

Our site starts to time out.

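Within about 5 minutes the site started to stabilize, and thankfully things fully cleared after 10 minutes. So **warning**: an Amazon Aurora read-only cluster endpoint does not act like a cluster. When a reader goes away, reads can keep being sent to it for several minutes.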
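
One way an application could soften a window like this is to retry a failed read connection a few times instead of failing the request outright. The sketch below is an assumption, not something we ran in production: `connect` is a hypothetical callable standing in for whatever opens a connection to the read-only endpoint, and the attempt count and delay are made-up values.

```python
import time


def connect_with_retry(connect, attempts=3, delay_seconds=0.5, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect is a hypothetical zero-argument callable that opens a
    connection to the read-only endpoint and raises OSError on failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except OSError as error:
            last_error = error
            # Pause before the next try, but not after the final one.
            if attempt < attempts - 1:
                sleep(delay_seconds)
    raise last_error
```

Whether a retry actually lands on a healthy reader depends on how quickly the endpoint stops handing out the deleted instance, which we did not measure.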

Purchasing a SIM card in ITALY

While using your American phone in Italy is simple, it is expensive. Unreliable Wi-Fi can make the reality of keeping in touch via a service like WhatsApp more frustrating than it should be.

Purchasing a SIM in Italy is a great way to avoid overage charges from your cell phone carrier. It ensures you don’t get stuck with a surprise bill, and you can use up to 4 GB of data (as of 10/2016).

If you anticipate using your phone data, consider traveling with a mobile phone fitted with an Italian SIM card.

What you will need
1) A PAPER CLIP.
2) 20 euros for 4 GB of data (10/2016), and a TIM store to buy it from
3) Your Phone
4) Your passport

I went with a SIM card from a company called TIM for 20 euros with 4 GB of data. After around 20 texts, I was unable to text. HOWEVER, I was able to keep my 4 GB of data. With that I was able to load Slack and WhatsApp.

THE PROCESS
1) Find a TIM store
2) They will require your passport.
3) Demand they put it in the phone WHEN YOU ARE IN THE STORE. Don’t walk far from the store for 60 minutes.
4) The SIM should register within about 30 minutes. If it has not, restart your phone. IF it still does not work, go back to the store.

SIM registration is required by the Italian government prior to service activation. This can be done using your passport when purchasing a SIM card. More information on the regulatory requirements can be found here at www.italycellphone.com

Ask them to TAPE YOUR CURRENT SIM card to the card you are given.

 

HAPPY TEXTING

Installing FFMpeg on Ubuntu 14.04

I was trying to cut some video files in Ubuntu for a photo project of mine. Ubuntu 14.04’s apt-get cannot install ffmpeg out of the box; the package is not in the default repositories.

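After digging around for a few hours I came across these commands. Each of the first three lines adds a PPA that ships ffmpeg (you should only need one); then update the package list and install:

   sudo add-apt-repository ppa:mc3man/trusty-media
   sudo add-apt-repository ppa:jon-severinsson/ffmpeg
   sudo add-apt-repository ppa:kirillshkrogalev/ffmpeg-next

   sudo apt-get update
   sudo apt-get install ffmpeg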

Enjoy!

Git: Force reverting to a previous commit

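These are the commands for force-reverting to a previous commit in git; they always seem to get lost on the interwebs. They record the old state as a new commit on top of the current history. Replace 23h8f32378 with your own commit hash.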

# Reset the index and working tree to the hash
git reset --hard 23h8f32378

# Move the branch pointer back to the previous HEAD
git reset --soft HEAD@{1}

git commit -m "Revert to 23h8f32378"

AWS EC2: Moving /var to EBS

From Tamas at work! not mine! :0)

Moving /var to /mnt on AWS. If you keep the default storage size when you create a new instance, the root volume is often only 8 GB. To move your /var directory to another mount point:

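# WARNING: this wipes everything currently on /mnt
rm -rf /mnt/*
# copy /var (including hidden files) onto the volume mounted at /mnt
cd /var
cp -ax . /mnt/
cd /
# keep the old /var as a backup and create an empty mount point
mv var var.old
mkdir var
# remount the volume on /var
umount -l /dev/xvdb
mount /dev/xvdb /var
nano /etc/fstab # make sure it will automount after restart
reboot
# once everything works after the reboot, remove the backup
rm -rf /var.old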

Twilio is rad: But don’t get ripped off

Twilio allows software developers to programmatically send and receive SMS/MMS messages using its web service APIs. Twilio is pretty simple and reliable. However, it is expensive so knowing how to keep the COSTS DOWN is helpful.

Long codes vs Short codes: Pick long codes.

From Twilio’s documentation:

Long codes are meant for person-to-person communications, and can send only 1 message per second. For high-volume, application-driven messaging, Twilio recommends using a short code.

This is probably the quickest way to run into a big Twilio bill. Long codes are cheaper. Twilio will recommend short codes. You want to use long codes. If you think you need more bandwidth, BUY ANOTHER PHONE LINE. Phone lines are cheap: $1 per month. The price difference between short and long codes is quite substantial: sending a text via long code costs $0.0075, versus $0.01 via short code.
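
To make the "buy another phone line" idea concrete, here is a rough sketch of sizing and rotating a pool of long codes. It takes the 1 message per second limit from the Twilio quote above; the phone numbers and function names are made up for illustration and are not part of Twilio's API.

```python
import itertools
import math


def long_codes_needed(messages_per_second, per_number_rate=1.0):
    """Number of long codes needed to sustain the given send rate."""
    return max(1, math.ceil(messages_per_second / per_number_rate))


def rotate_senders(numbers):
    """Yield sender numbers in round-robin order, forever."""
    return itertools.cycle(numbers)


# Hypothetical pool: 3 messages per second needs 3 long codes.
pool = ["+15550000001", "+15550000002", "+15550000003"]
senders = rotate_senders(pool)
first_four = [next(senders) for _ in range(4)]  # wraps back to the first number
```

At $1 per number per month, a pool of three numbers adds about $3 a month.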

 The Infinite loop [and the emptying of your account]

The SMS log is a great thing to check OFTEN. This log contains all information related to sent SMS messages, and it is where we spotted a classic example of a loop. Twilio can reply to text messages, and it can also send text messages. SO, what if Twilio messages itself? Twilio will not detect or stop these loops, and they can cost a fortune if not spotted quickly. The log can be found under SMS logs.
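
Since Twilio will not stop a loop for you, a cheap guard in your own reply handler can help. This is a minimal sketch, assuming you keep a set of the numbers your account owns; `OWN_NUMBERS`, `should_reply`, and the per-sender cap are hypothetical names and values, not part of Twilio's API.

```python
from collections import Counter

# Hypothetical: the Twilio numbers this account owns.
OWN_NUMBERS = {"+15550000001", "+15550000002"}
MAX_REPLIES_PER_SENDER = 5  # made-up cap per counting window

reply_counts = Counter()


def should_reply(from_number, own_numbers=OWN_NUMBERS, counts=reply_counts):
    """Return False when replying could feed a message loop."""
    if from_number in own_numbers:
        # The inbound text came from one of our own numbers.
        return False
    if counts[from_number] >= MAX_REPLIES_PER_SENDER:
        # We have already auto-replied to this sender too many times.
        return False
    counts[from_number] += 1
    return True
```

The counter would need to be reset on whatever schedule suits your traffic; that part is left out here.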

Country Codes: Why are people in Vietnam using your service?

Texting in 3rd world countries can be expensive. Hackers from those countries can often find holes in your software that let them send 3rd party texts using your account. The quickest way to save cash is to disable other countries in the Twilio console.
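
Alongside the console setting, the application itself can refuse to send to unexpected destinations. A minimal sketch, assuming phone numbers are stored in E.164 form (leading "+" and country code); `ALLOWED_PREFIXES` and `is_allowed_destination` are hypothetical names.

```python
# Hypothetical allowlist of country calling codes we expect to text.
ALLOWED_PREFIXES = ("+1",)


def is_allowed_destination(to_number, allowed_prefixes=ALLOWED_PREFIXES):
    """Return True if the E.164 number starts with an allowed prefix."""
    return to_number.startswith(tuple(allowed_prefixes))
```

Note that "+1" covers Canada and several Caribbean countries as well as the US, so a prefix check like this is only a coarse filter.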

Twilio: YOU ARE CHARGED FOR ALL INBOUND TEXTS.

When a user replies to one of your text messages sent by Twilio, Twilio will make a REST call to the SmsUrl defined for that phone number. YOU WILL GET CHARGED. To ENSURE you are not being charged for return texts, remove the SmsUrl from all your phone numbers. Here is an example of removing the SmsUrl from my entire account.

<?php
// RUN first: composer require twilio/sdk
// (this example uses the legacy Services_Twilio client from the 4.x SDK)
require_once("./vendor/autoload.php");

// Your Account Sid and Auth Token
$sid = "";
$token = "";
$client = new Services_Twilio($sid, $token);

// Loop over every phone number on the account and
// blank out its voice and SMS callback URLs
foreach ($client->account->incoming_phone_numbers as $number) {
    echo "\n" . $number->phone_number;
    $number->update(array(
        "VoiceUrl" => "",
        "SmsUrl" => ""
    ));
}
?>

You pay more than what is stated

You can see from the earlier section that the cost per SMS sent is either $0.0075 or $0.01. BUT Twilio will up the price depending on the number of texts it made for that ONE PARTICULAR SMS API CALL. In our case Twilio showed a “2” next to the number of texts sent. We only made one API call. WTF?

Why or how it creates multiple texts will stump anyone. But we believe it has to do with some carriers not supporting concatenated messages. Twilio needs to do its part and clearly break down why this is the price and WHY we sent two texts.
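
One likely cause is message length, though that is an assumption and not something Twilio confirmed to us. Under the standard SMS limits, a single text holds 160 GSM-7 characters (70 if it contains Unicode), and longer bodies are split into parts of 153 (or 67) characters each. The sketch below estimates the part count and cost under that assumption; treating all ASCII as GSM-7 is a simplification, and the function names are made up.

```python
import math

LONG_CODE_PRICE = 0.0075  # dollars per text, from the figures above


def sms_segments(body):
    """Estimate how many SMS parts a message body is split into.

    Simplification (an assumption): pure-ASCII text is treated as GSM-7,
    anything else as UCS-2. The real GSM-7 alphabet differs slightly.
    """
    if body.isascii():
        single_limit, multi_limit = 160, 153
    else:
        single_limit, multi_limit = 70, 67
    if len(body) <= single_limit:
        return 1
    return math.ceil(len(body) / multi_limit)


def estimated_cost(body, price_per_segment=LONG_CODE_PRICE):
    """Estimated charge for one API call, assuming billing per part."""
    return sms_segments(body) * price_per_segment
```

By this estimate a 161-character ASCII message counts as two texts, so one API call would cost $0.015 on a long code instead of $0.0075.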

Adding MySQL columns to a large table

Changing a large MySQL table has become much easier. I just wanted to document using the pt-online-schema-change command. The following example adds a new column to a Users table.

# pt-online-schema-change h=api.rds.amazonaws.com,t=Users,u=xxx \
    --database xxx --ask-pass --alter "ADD COLUMN (last_ip varchar(40))" \
    --nocheck-replication-filters --critical-load Threads_running=100 --execute

Enter MySQL password:
Found 7 slaves:
  ip-10-xxx
  ip-10-xxx
  ip-10-xxx
  ip-10-xxx
  ip-10-xxx
  ip-10-xxx
  ip-10-xxx
Will check slave lag on:
  ip-10-xxx
  ip-10-xxx	
	....

# 8 software updates are available:
#   * The current version for MySQL Community Server (GPL) is 5.6.24.
#   * The current version for MySQL Community Server (GPL) is 5.6.24.
#   * The current version for MySQL Community Server (GPL) is 5.6.24.
#   * The current version for MySQL Community Server (GPL) is 5.6.24.
#   * The current version for MySQL Community Server (GPL) is 5.6.24.
#   * The current version for MySQL Community Server (GPL) is 5.6.24.
#   * The current version for MySQL Community Server (GPL) is 5.6.24.
#   * The current version for MySQL Community Server (GPL) is 5.6.24.

Operation, tries, wait:
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `xxx`.`Users`...
Creating new table...
Created new table xxx._Users_new OK.
Waiting forever for new table `xxx`.`_Users_new` to replicate to ip-xxxx...
Altering new table...
Altered `xxx`.`_Users_new` OK.
2016-03-30T13:55:11 Creating triggers...
2016-03-30T13:55:11 Created triggers OK.
2016-03-30T13:55:11 Copying approximately 3827686 rows...
Copying `xxx`.`Users`:  10% 04:18 remain

Securing Jenkins Ubuntu 12.04.5 LTS

By default, Jenkins allows everyone to see your jobs. Securing Jenkins is pretty easy:

0) Add two arguments to JENKINS_ARGS in /etc/default/jenkins

# --argumentsRealm.passwd.$ADMIN_USER=[password]
# --argumentsRealm.roles.$ADMIN_USER=admin

This should be near the end of the file. Once changed, restart Jenkins.

1) Install https://wiki.jenkins-ci.org/display/JENKINS/Role+Strategy+Plugin

2) Enable the plugin by going to the secure area:

a) http://YOURDOMAIN:PORT/configureSecurity/

b) Click:

3) Restart Jenkins.

4) Under configuration settings http://YOURDOMAIN:PORT/manage

Click on Manage Roles (could have changed, basically anything with roles)

Add a new group called “Anonymous” and uncheck everything. Then you want to add another group called “authenticated” and check everything. Jenkins will immediately prompt you for a login this way.

vi /var/lib/jenkins/config.xml

CDN Comparison: Edgecast, S3 and Cloudfront in San Francisco

Recently we decided to compare CDNs. Using New Relic, we set up a test on Synthetics. Pretty interesting data.

Edgecast All countries

Edgecast Just San Francisco

EdgeCast seems to have occasional long hangs even in San Francisco.

 

S3 Just San Francisco

S3 seems to be slow, but very consistent.

CloudFront Just San Francisco

CloudFront has the greatest speed bursts; however, it suffers from a few long pulls.

s3-parallel-put: Move files to AWS S3 Fast

Today I used s3-parallel-put to send files to S3. The directory I was working with contained millions of small files. Using the standard s3cmd with the sync option never seemed to finish, and it gave no error messages. With s3-parallel-put I can push files in parallel and even control the number of processes.

Using the command:

python /usr/bin/s3-parallel-put \
    --content-type=guess \
    --processes=30 \
    --verbose \
    --bucket=[YOUR BUCKET] \
    /uploads/ >> /tmp/backup/log.txt 2>&1

The only tricky part here is the “guess” option. This tells s3-parallel-put to guess the content type of each object you are uploading, so it can be stored with the object. S3 needs this information when it serves the object. Web browsers do most of the retrieving and they want headers (which include content types)!
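
To give a feel for what guessing a content type from a file name looks like, here is a small sketch using Python's standard mimetypes module. Whether s3-parallel-put does exactly this internally is an assumption; the paths and the fallback default are made up.

```python
import mimetypes


def guess_content_type(path, default="application/octet-stream"):
    """Guess a Content-Type header value from a file name."""
    content_type, _encoding = mimetypes.guess_type(path)
    return content_type or default


print(guess_content_type("/uploads/photo.jpg"))     # image/jpeg
print(guess_content_type("/uploads/index.html"))    # text/html
print(guess_content_type("/uploads/no-extension"))  # application/octet-stream
```

Files without a recognizable extension fall back to a generic binary type, which browsers will usually download instead of displaying.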

Also, in the examples in the GitHub project there is a “PREFIX”. I still have no idea what it is.

https://github.com/twpayne/s3-parallel-put

You may need to install boto. If so, this is what I did (using Linux):

sudo easy_install pip
sudo pip install boto