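We have an Auto Scaling group in AWS, and on launch the database connection times out. This is a Zend Framework 1 application running PHP 7.0.22-0ubuntu0.16.04.1 on Ubuntu 16.04.3 LTS.
The code is baked into the AMI, but during launch the userdata script pulls from git and configures the application. The database domain has not changed in years; the configure step mainly decides which ElastiCache instance to use. In other words, the baked-in code is already configured for the database, and the configure step overwrites that configuration with the same values.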
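Once the EC2 instance is in the ELB, it starts getting hit at /health-check so the load balancer can see whether it is healthy. That controller contains the following code: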
public function healthCheckAction() {
    try {
        /* @var $DBConn Zend_Db_Adapter_Pdo_Mysql */
        $DBConn = Zend_Registry::get('multidb')->getDb();

        // test guide service (most likely will be from memcache, unlikely to hit db)
        $guideService = $this->_apiGuideService();
        $guideService->isLoaded();

        // test raw db connection
        // this line fails and throws an exception
        // I put host in here just so an error would include it in throw during this phase instead of catch phase (where it works)
        $dbh = new PDO("mysql:host={$DBConn->getConfig()['host']};dbname={$DBConn->getConfig()['dbname']}", $DBConn->getConfig()['username'], $DBConn->getConfig()['password']);
        $data = $dbh->query("SELECT '{$DBConn->getConfig()['host']}' as host, now()")->fetchObject();

        // test database connectivity
        // I put host in here just so an error would include it in throw during this phase instead of catch phase (where it works)
        $sql = "SELECT '{$DBConn->getConfig()['host']}' as host, now()";
        $DBConn->fetchRow($sql);

        // test cache
        /* @var $cache Zend_Cache_Core */
        $cache = Zend_Registry::get('cachemanager')->getCache('default');
        if (!$cache->load('health_check')) {
            $cache->save(true, 'health_check');
        }

        echo 'Instance is healthy';
    }
    catch (Exception $e) {
        header('HTTP/1.1 500 Internal Server Error');
        echo 'Instance is unhealthy';

        // get instance ip from the EC2 metadata service
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, 'http://169.254.169.254/latest/meta-data/public-ipv4');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $ip = curl_exec($ch);

        // get instance id (reuses the same curl handle)
        curl_setopt($ch, CURLOPT_URL, 'http://169.254.169.254/latest/meta-data/instance-id');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $instance = curl_exec($ch);
        curl_close($ch);

        // email us some info
        $message = "Instance $instance failed health check. ssh ubuntu@$ip to investigate<br><br>" . $e->getLine() . " " . $e->getCode() . "<br>" . $e->getMessage() . "<br>" . $e->getTraceAsString(). "<br><br>";

        ob_start();

        // this works and returns access denied, not timeout
        $this->runCommand('mysql -u examplecom_platform -h sg-rds-example.us-east-1.rds.amazonaws.com');

        echo "testing DB with php<br>";
        try {
            echo "write host: " . $DBConn->getConfig()['host'] . "<br>";
            echo "read host: " . $DBConn->getConfig()['host'] . "<br>";
            $dbh = new PDO("mysql:host={$DBConn->getConfig()['host']};dbname={$DBConn->getConfig()['dbname']}", $DBConn->getConfig()['username'], $DBConn->getConfig()['password']);
            $data = $dbh->query('select now()')->fetchObject();
            echo "query database without zend:<br>";
            print_r($data);
            // this line works and prints out
            // stdClass Object
            // (
            // [now()] => 2018-01-09 14:47:12
            // )
            $dbh = null;
        } catch (PDOException $e) {
            print "Error: " . $e->getMessage() . "<br/>";
        }

        // these all work/show the correct IP
        $this->runCommand('nc -vz sg-rds-example.us-east-1.rds.amazonaws.com 3306');
        $this->runCommand('host sg-rds-example.us-east-1.rds.amazonaws.com');
        $this->runCommand('dig sg-rds-example.us-east-1.rds.amazonaws.com');

        $debug = ob_get_contents();
        ob_end_clean();
        $message .= "<br><br>" . str_replace("\n", "<br>", $debug);

        $mail = new Zend_Mail();
        $mail->setSubject('[examplecom] Instance Failed Healthcheck v2')
            ->setFrom('noreply@example.com')
            ->addTo('alerts@example.com')
            ->setBodyHtml($message)
            ->send();
    }
}
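As I keep debugging, I have been adding more and more checks to test the connection.
The try block throws the error SQLSTATE[HY000] [2002] Connection timed out, but the exact same connection works in the catch block and is able to query now() from the database.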
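This is where I am stumped: how can the same process time out on its first connection attempt, yet connect successfully while handling the error?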
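One way to narrow this down might be to record how long each connection attempt takes, so the alert email shows whether the first attempt failed only after a long wait while the second connected quickly. The following is a minimal sketch under that assumption; timedCheck() is a hypothetical helper (it is not part of Zend Framework or PDO), and the commented usage assumes the same $DBConn config array as the health check above.

```php
<?php
/**
 * Run one connectivity check and record how long it took and how it ended.
 * timedCheck() is a hypothetical helper for illustration only.
 *
 * @param string   $label Name of the check, used only in the report.
 * @param callable $check Performs the check; expected to throw on failure.
 * @return array Keys: label (string), ok (bool), seconds (float), error (string|null).
 */
function timedCheck(string $label, callable $check): array
{
    $start = microtime(true);
    $error = null;
    try {
        $check();
    } catch (Throwable $e) {
        // Keep the exception class so a PDOException timeout is distinguishable.
        $error = get_class($e) . ': ' . $e->getMessage();
    }
    return [
        'label'   => $label,
        'ok'      => $error === null,
        'seconds' => microtime(true) - $start,
        'error'   => $error,
    ];
}

// Hypothetical usage inside healthCheckAction(), with $cfg = $DBConn->getConfig():
// $report[] = timedCheck('raw pdo', function () use ($cfg) {
//     new PDO("mysql:host={$cfg['host']};dbname={$cfg['dbname']}", $cfg['username'], $cfg['password']);
// });
```

Appending each result to the alert email would show the elapsed seconds for the failing attempt next to the later successful one.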
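Also, I only get 1 or 2 of these emails saying it cannot connect; by the time I am able to log in and test things, everything is working and connecting fine. The health check reports healthy and the instance stays in the ELB.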
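Since the failures appear only during the first moments after launch, a stopgap while investigating could be to retry the connection a few times before reporting the instance unhealthy. This is only a sketch of a generic retry with exponential backoff; retryWithBackoff() and its parameters are hypothetical, and retrying would hide the symptom rather than explain the first timeout.

```php
<?php
/**
 * Call $attempt up to $maxAttempts times, sleeping between failed attempts.
 * retryWithBackoff() is a hypothetical helper for illustration only.
 *
 * @param callable $attempt     Receives the attempt number (1-based); throws on failure.
 * @param int      $maxAttempts Total number of attempts, must be at least 1.
 * @param int      $baseDelayMs Delay before the second attempt; doubles each time.
 * @return mixed Whatever $attempt returns on its first success.
 */
function retryWithBackoff(callable $attempt, int $maxAttempts = 3, int $baseDelayMs = 200)
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException('maxAttempts must be at least 1');
    }
    $lastError = null;
    for ($try = 1; $try <= $maxAttempts; $try++) {
        try {
            return $attempt($try);
        } catch (Exception $e) {
            $lastError = $e;
            if ($try < $maxAttempts) {
                // With the default base delay: 200 ms, then 400 ms, and so on.
                usleep($baseDelayMs * 1000 * (2 ** ($try - 1)));
            }
        }
    }
    // Every attempt failed: surface the most recent exception.
    throw $lastError;
}
```

In the health check, the callable would wrap the new PDO(...) call, and logging the attempt number on success would show how many tries a freshly launched instance needs.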
Any ideas, or suggestions for more debugging to add?