BLASTing Linux Code

Jan Tobias Mühlberg and Gerald Lüttgen
Department of Computer Science, University of York, York YO10 5DD, U.K.


main page | next example


Commit Overview | Files | Comments

Checking Locking Properties: Example 2


Commit Overview

Commit Key fe2e17a405a58ec8a7138fee4ebe101858b636e0
Subject [PATCH] dpt_i2o fix for deadlock condition
Description Miquel van Smoorenburg forwarded me this fix to resolve a deadlock condition that occurs due to the API change in 2.6.13+ kernels dropping the host locking when entering the error handling. They all end up calling adpt_i2o_post_wait(), which if you call it unlocked, might return with host_lock locked anyway and that causes a deadlock.
Requires Linux 2.6.14 kernel source as from git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/linux-2.6.14.y.git

--- a/drivers/scsi/dpt_i2o.c
+++ b/drivers/scsi/dpt_i2o.c
@@ -660,7 +660,12 @@ static int adpt_abort(struct scsi_cmnd *
msg[2] = 0;
msg[3]= 0;
msg[4] = (u32)cmd;
- if( (rcode = adpt_i2o_post_wait(pHba, msg, sizeof(msg), FOREVER)) != 0){
+ if (pHba->host)
+ spin_lock_irq(pHba->host->host_lock);
+ rcode = adpt_i2o_post_wait(pHba, msg, sizeof(msg), FOREVER);
+ if (pHba->host)
+ spin_unlock_irq(pHba->host->host_lock);
+ if (rcode != 0) {
if(rcode == -EOPNOTSUPP ){
printk(KERN_INFO"%s: Abort cmd not supported\n",pHba->name);
return FAILED;
@@ -697,10 +702,15 @@ static int adpt_device_reset(struct scsi
msg[2] = 0;
msg[3] = 0;
+ if (pHba->host)
+ spin_lock_irq(pHba->host->host_lock);
old_state = d->state;
d->state |= DPTI_DEV_RESET;
- if( (rcode = adpt_i2o_post_wait(pHba, msg,sizeof(msg), FOREVER)) ){
- d->state = old_state;
+ rcode = adpt_i2o_post_wait(pHba, msg,sizeof(msg), FOREVER);
+ d->state = old_state;
+ if (pHba->host)
+ spin_unlock_irq(pHba->host->host_lock);
+ if (rcode != 0) {
if(rcode == -EOPNOTSUPP ){
printk(KERN_INFO"%s: Device reset not supported\n",pHba->name);
return FAILED;
@@ -708,7 +718,6 @@ static int adpt_device_reset(struct scsi
printk(KERN_INFO"%s: Device reset failed\n",pHba->name);
return FAILED;
} else {
- d->state = old_state;
printk(KERN_INFO"%s: Device reset successful\n",pHba->name);
return SUCCESS;
}
@@ -721,6 +730,7 @@ static int adpt_bus_reset(struct scsi_cm
{
adpt_hba* pHba;
u32 msg[4];
+ u32 rcode;
pHba = (adpt_hba*)cmd->device->host->hostdata[0];
memset(msg, 0, sizeof(msg));
@@ -729,7 +739,12 @@ static int adpt_bus_reset(struct scsi_cm
msg[1] = (I2O_HBA_BUS_RESET<<24|HOST_TID<<12|pHba->channel[cmd->device->channel].tid);
msg[2] = 0;
msg[3] = 0;
- if(adpt_i2o_post_wait(pHba, msg,sizeof(msg), FOREVER) ){
+ if (pHba->host)
+ spin_lock_irq(pHba->host->host_lock);
+ rcode = adpt_i2o_post_wait(pHba, msg,sizeof(msg), FOREVER);
+ if (pHba->host)
+ spin_unlock_irq(pHba->host->host_lock);
+ if (rcode != 0) {
printk(KERN_WARNING"%s: Bus reset failed.\n",pHba->name);
return FAILED;
} else {

(purple: line numbers and function names; red: line removed; green: line added)

Files

Unmodified sources

Comments

The problem in this case study is well explained above and the fixes are straightforward. We only want to add, that the trouble-causing call of spin_lock_irq() on the host lock can be found in line 1172 of the source file provided.

Tracing this error shaped up as being straightforward, too. However, the use of pointers and complex data structures (i.e. pHba->host->host_lock) forced us to use simple integer variables instead of the spinlock_t pointers.

Source: http://www.kernel.org



Jan Tobias Mühlberg, $Date$